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Recommended Books and Resources 


This lecture course covers three topics: kinetic theory, stochastic processes and linear 
response. Most decent books on statistical mechanics will have a section covering non- 
equilibrium topics in general. However, if you’re looking for details beyond the basics, 
you'll probably need a different book for each topic. Some good general purpose books 
are: 


e Huang, Statistical Mechanics 
e Kardar, Statistical Physics of Particles 
e Reif, Fundamentals of Statistical and Thermal Physics 


Both Huang and Kardar treat kinetic theory and the Boltzmann equation before they 
move onto statistical mechanics. Much of Section 2 of these notes follows the path laid 
down in the these books. Reif ends with a much wider ranging discussion of kinetic 
theory, transport and stochastic processes. 


For more details on kinetic theory: 
e Chapman and Cowling, The Mathematical Theory of Non-Uniform Gases 
e Lifshitz and Pitaevskii, Physical Kinetics 


Both of these are old school. The first was published in 1939 although the latest edition, 
written in 1970, is modern enough to cover all the developments that we touch upon 
in this course. The last volume of the course by Landau and Lifshitz covers kinetic 
theory. This book was written substantially later than the earlier volumes, decades 
after Landau’s death. 


For more details on stochastic processes: 
e Van Kampen, Stochastic Processes in Physics and Chemistry 


The topic of linear response is usually covered in books on many body theory or more 
general condensed matter. Two excellent modern books, both with a chapter on re- 
sponse theory, are 


e Altland and Simons, Condensed Matter Field Theory 
e Chaikin and Lubensky, Principles of Condensed Matter Physics 


Finally, there are a number of good lecture notes and resources on the web, collated at 
http://www.damtp.cam.ac.uk/user/tong/kinetic.html 
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1. Things Bumping Into Other Things 


1.1 Introduction 


The purpose of this course is to describe a number of basic topics in non-equilibrium 
statistical mechanics. 


If you’ve taken a first course in Statistical Mechanics, you’ll know that the whole 
machinery of ensembles and partition functions only works when applied to systems 
in equilibrium. Equilibrium is defined to be a state in which, at least on the coarse 
grained level, things don’t change. Of course, if you have a hot system and you look 
closely enough, everything is flying around on the atomic level. But if you focus only 
on macroscopic variables then, in equilibrium, all is calm. 


At first, the restriction to equilibrium sounds rather limiting. But it’s not. This is 
because the state of equilibrium is very special: if you take any system and wait long 
enough then it will eventually relax down to equilibrium. (This is sometimes said to 
be the —1' law of thermodynamics). 


Of course, this begs the question of why equilibrium is special. Why do all systems 
eventually reach this state. How do they approach this state? How does such irre- 
versible behaviour arise from the fundamental laws of physics which are, for all intents 
and purposes, invariant under time reversal? Moreover, what if you’re not happy to 
just sit back and watch an equilibrium system? Suppose you want to stir it or splash 
it or attach a couple of crocodile clips and zap it. How will it respond? These are the 
kind of questions that we will begin to answer in this course. 


While there is typically only a single equilibrium state, for a system with 10° par- 
ticles, there are many many ways to be out-of-equilibrium. Most of these states are 
uninteresting in the sense that they will be so complicated that no general features will 
emerge. Moreover, such states will be fleeting, rapidly changing to another complicated 
configuration. If we’re to have any chance of making progress, we need to be careful 
about the kind of states we discuss and the kind of questions that we ask. We would 
like to identify features in the dynamics of 107° particles that persist for long periods 
of time. We will see that such features arise for systems that are close to equilibrium. 
Indeed, throughout this course, the dramatic sounding “non-equilibrium” will really 
mean “almost-equilibrium” . 


Each of the four sections in these lecture notes can be read more or less independently. 
In the rest of this introductory section, we will introduce a few basic tools to describe 
how quantities change in a gas. This will really be a baby version of kinetic theory, with 


nothing more sophisticated than Newtonian thinking applied to a bunch of billiard balls. 
But it will allow us to develop some basic intuition for the rudiments of the subject. 
While many of the formulae we derive in this section are rather heuristic, all will be 
revisited Section 2 where we use the Boltzmann equation to give a more rigorous view 
on the subject, understanding transport phenomena and deriving the equations of fluid 
mechanics starting from first principles. Section 3 introduces the subject of random 
jittery motion, usually called stochastic processes. Finally, in Section 4 we turn the 
stir-it-splash-it-zap-it question and develop the machinery necessary to describe how 
systems respond when prodded. 


1.2 Basics of Collisions 


Let’s start by considering N molecules in a gas of volume V. We will begin by ignoring 
all interactions between particles. Instead, we will treat the molecules as spheres of a 
finite size which will allow collisions to occur. For the most part, we won’t rely on the 
results of earlier courses on statistical mechanics. There is, however, one exception: in 
the rest of this section, we will need the Maxwell-Boltzmann probability distribution 
for the velocities in a gast. 
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The distribution f(v)d*v is the probability that a molecule has velocity within a small 
volume d?v in the neighbourhood of v. 


We denote the diameter of the particle as d. Obviously its radius is d/2. Viewed 
head on, the particle appears as a disc with area 7(d/2)?. However, more relevant for 
our purposes is the effective cross-sectional area of the particle, td?. To see why this is, 
focus on a single particle as it makes its way through the gas. If it travels a distance J, 
it will sweep out a volume 7d? as shown in Figure 1 and collide with any other particle 
whose centre lies within this volume. 


The mean free path is defined to be the average distance travelled by the molecule 
between each collision. This is given by 7d?] = V/N, or 
V 1 1 


~ Nad nid (1.2) 
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where n = N/V is the particle density. 


1This result will be re-derived in Section 2 when we discuss the Boltzmann equation. You can also 
find a simple derivation in the lectures on Statistical Physics. 
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Figure 1: A particle of radius d/2 travels, on average, a length | between each collision. In 
this time it sweeps out a volume 7d?l. 


In what follows, we’ll assume that our gas is dilute, meaning / >> d. For typical gases 
d~ 1071m while, at atmospheric pressure, | ~ 1077 m. 
1.2.1 Relaxation Time 


The average time between collisions is called the scattering time or relaxation time, 
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You might think that v,e, is the average speed of a given particle. This isn’t quite true. 

Since we’re interested in the rate of collisions, the speed of other particles approaching 

is just as important as the speed of the particle you’re looking at. So we should take 

Vre] to be the average relative speed of the molecules. For two particles with velocities 
v and wv’, the average relative speed is 

w= (0-0) = fas f ao’ E-o) 

= (v") + (u!?) — 20. a") (1.3) 


where f(v) in the first line is the Maxwell-Boltzmann distribution (1.1). The fact that 
we have multiplied the distributions f(v)f(v’) together in the first line means that 
we are assuming that the velocities of the two particles are uncorrelated. This is an 
assumption that we shall return to in Section 2. 


The last term in (1.3) vanishes: (v- v") = 0. This follows on rotational grounds. 
Because the velocity of each particle is independent, it’s enough to know that the 
average velocity (not speed!) in, say, the x-direction vanishes: (v,) = 0. Meanwhile, 
(v2) = (v'?) which means that 02, = 2(v). It is a simple exercise to compute (v?) from 
the Maxwell-Boltzmann distribution (1.1) and the answer is the same as you would get 


by simply appealing to the equipartition of energy: (v?) = 3kgT/m. We have 
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and the relaxation time is given by 


B 1 m 
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Notice that as the temperature drops, the mean free path remains unchanged. However, 
the time between collisions increases. 


There is a slightly different interpretation of the relaxation time that it is useful 
to have in hand. Suppose that the probability that a molecule undergoes a collision 
between time t and time t + dt is given by wdt, for some constant w, known as the 
collision rate. Notice that in stating this, we have made more assumptions about the 
nature of the collisions. In particular, the fact that w is a constant means that no 
memory of previous collisions is kept: the chances of being hit again are not affected 
just because you already were hit a short time ago. 


If P(t) is the probability that the molecule makes it to time t unharmed, then the 

probability that it further makes it to time t + dt without collision is 
P(t + dt) = P(t)(1 — wdt) 
Writing this as a differential equation, we have 
dP 

—=-wP => P(t)=e” 
a (t) 
where we’ve chosen the normalisation so that P(0) = 1 and P(co) = 0. With this 
in hand, we can compute the average time between collisions. But this is exactly the 
quantity that we called the relaxation time above. It is 


r= | P@at=— 


We learn that 1/7 is the collision rate. 


1.3 Basics of Transport 


We now turn to the question of how things move. Of course, in a thermal system, 
the microscopic constituents are always moving, even in equilibrium. Our goal here 
is to understand how certain macroscopic properties move when out of equilibrium. 
The properties that we will look at are all associated to a conserved quantity: particle 
number, energy or momentum. Processes in which these quantities change over time 
are usually referred to as transport. As we will see, all of these quantities typically flow 
in such a way as to reach the equilibrium state. 


1.3.1 Diffusion 


Drop a blob of ink into a glass of water. How does it spread? More generally, we are 
interested in the motion of a particular kind of particle — one with a nice colour, or 
funny smell — as it makes its way through a generic background of liquid or gas. The 
true dynamics of any particle is, as you might expect, somewhat jittery. Here we’ll look 
at a simple model that captures this physics. 


Random Walk 


Consider a lattice which, for now, we take to be one dimensional. The spacing between 
the lattice sites is set by the mean free path, l, and after a time, T, the particle jumps 
either left or right. The direction of the jump is entirely random: 50% of the time it 
goes left, 50% right. This model is known as a random walk. 


The particle starts at the origin and we want to know the probability P(x, t) that it 
sits at x = ml at time t = Nr. (Here m is an integer; it’s not the mass of the particle!). 
We'll start by giving a simple combinatoric derivation of the answer. For simplicity, 
we'll take N to be even and we’ll look at m < N. To get to x = ml, the particle must 
have made $(N +m) forward jumps and $(N — m) backwards jumps. The probability 
is just the number of different ways we can do this, divided by 2%, the total number of 
possible combinations. 
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where, in the second step, the factorials have been replaced by Stirling’s approximation 


and we’ve also expanded to leading order in m/N. (To get the prefactor, we need to 
go to the third order in the Stirling expansion). 


The probability distribution of the particle is an ever-spreading Gaussian ensemble. 
The mean is simply (x) = 0, reflecting the fact that the particle is equally likely to 
travel forwards as backwards. The variance is 

a, P 
= —t 1.5 
(2%) == (1.5) 
The root-mean-square (rms) distance travelled by the particle grows as \/(x?) ~ vt. 
This is characteristic behaviour of random walks. 


It is simple to repeat our analysis of the random walk to three dimensions. For a 
cubic lattice, we assume that the motion in each of the directions is independent and 
equally likely. On average, the particle moves in the x-direction only every 37, so (1.5) 


should be replaced by (x?) = [?t/37. But this means that the total rms distance covered 
remains unchanged 
Ê 


(Z) = (x°) + °) + (2°) = t 


The Diffusion Equation 


We can recast the above discussion in terms of a differential equation for the density 
of particles, n = N/V. Away from equilibrium, the density is not a constant. It is, in 
general, a function of time and space. We expect any gradient, Vn, in the density of 
particles to lead to a flow, from the high density region to the low. 


We’ll again restrict first to the case of one-dimension. Consider the density at some 
fixed time: n = n(x,t). We’d like to derive an expression for the density at the point 
x a short time At later. Of course, some particles will leave, but others will come in 
to replace them. Any particle which is at x at time t + At must have been sitting at 
some other position x — Az at time t. Here Az should be viewed as a random variable 
since some move one way, some the other. This means that we can write an expression 
for the density at time t + At as an average over all the different Az, 


n(t + At, x) = (n(t,x — Azx)) 
On 18n, > 
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The term with the first order derivative vanishes because, on average, particles are 


= n(t,x) — 


equally likely to go either way, meaning (Az) = 0. Taylor expanding the left-hand- 
side, we arrive at the diffusion equation 
ðn 8n 
ðt Aa? 
where the diffusion constant is D = (Ax?) /2At. We expect this to be related to our 
two quantities, the mean free path / and scattering time 7. On dimensional grounds, 
we must have 
PPRA 
z 
Solutions to the diffusion equation evolve so as to iron out any inhomogeneities in 
particle density. As an example, suppose that all N particles start out life sitting at 
the origin, giving us the initial condition n(z,t = 0) = Nod(x). The solution to the 
diffusion equation with this initial condition is an ever-spreading Gaussian, 


ime 
n(x, t) =N ADE /4Dt 


This reproduces the discretised result (1.4). Viewing the average distance travelled as 
the width of the cloud of particles, we again have the result 


(a?) = 2Dt 


It is simple to extend the derivation above to three dimensions. Going through the 
same steps, we now find the 3d diffusion equation, 

On 
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This is also known as Fick’s (second) law. We again expect that D ~ I?/r. (Although 
the overall numerical factor is not necessarily the same as the 1d case. In fact, in simple 
analysis it is a factor of 3 less). The Gaussian again provides a solution, now with 


(#7) = 6Dt 


As we will now show, a number of other processes also follow this general diffusive 
form. 


1.3.2 Viscosity 


Viscosity is a form of internal friction experienced by 
a fluid. It can be measured by placing a fluid between 
two plates, a distance d apart in the z direction. Holding 
the lower plate stationary, the top plate is moved at a 
constant speed, u, in the x direction. But you don’t get 


to do this for free: the fluid pushes back. If you want to 
keep the plate moving at a constant speed, you have to Figure 2: 
apply a force F. 


Near the upper plate, a friction force causes the fluid to be dragged along with the 
same speed u. However, near the lower plate, the fluid remains stationary. This sets up 
a velocity gradient, u,(z), with u,(d) = u and u,(0) = 0. Experimentally, it is found 
that the force per unit area which must be exerted on the upper plate is proportional 
to this velocity gradient, 


F dur u 
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(1.6) 


where the second equality holds for small distances d. The coefficient of proportionality, 
n, is called the viscosity. (Or, more correctly, the dynamic viscosity). 


We would like to derive both the force law (1.6) and the viscosity 7 from first princi- 
ples. It’s simple to get an intuition for what’s happening on the atomic level: when the 
molecules collide with the upper plate, they pick up some z-momentum. They then 
collide with other molecules lower down, imparting some of this z-momentum to new 
molecules, which then collide with other molecules lower down, and so on. In this way, 
we set up the velocity gradient in the z direction. 


We'll think of a slab of gas at some fixed value of z. To figure out the force acting on 
this slab, we need to work out two things: the number of particles moving through the 
slab per unit of time; and the extra momentum in the x-direction that each particle 
imparts to the molecules in the slab. 


Let’s first deal with the number of particles. The density of particles in the fluid is 
n = N/V. How many of these pass through a slab in the z-direction in a given length 
of time depends on how fast they’re travelling in the z-direction (obv!). But we know 
how many particles there are with each speed: this is given by the Maxwell-Boltzmann 
distribution (1.1). The net result is that the number of particles, per unit time, per 
unit area, whose velocity is lies close to ọ (in a box of size dv), passing through a 
horizontal slab is 


# of particles per unit time per unit area = nv, f(v) d?v (1.7) 


Now let’s figure out the momentum that each of these molecules imparts. Consider a 
particle at some position z. It gets hit from below, it gets hit from above. The hits 
from above are likely to give it more momentum in the x direction; those from below, 
less. Let’s consider those ariving from above. If they arrive from a position z + Az, 
then they impart z-momentum 


dux 
Ap = m(us(z + Az) — uz(2)) =% m z Az (1.8) 
z 
What is the distance Az here? Well, this depends 
on the angle the particles come in at. They have S 
travelled the mean free path l, so if they arrive at N l cosa 
angle 0 then we must have | 
Figure 3: 


Az =Icos@ 


Here 6 € [0, 7/2) for particles arriving from above. But the same argument also holds 
for particles coming in from below. These have 0 € (7/2,7] and, correspondingly, 


Az < 0 which, from (1.8), tells us that these particles typically absorb momentum 
from the layer at z. 


Our goal is to work out the force per unit area acting on any z slice. This is given 
by the rate of change of momentum 


F  1Ap 

A AA 
where the minus sign arises because F defined in (1.6) is the force you need to apply to 
keep the flow moving (while Ap/At is the force of the fluid pushing back). The rate of 
change of momentum per unit area is simply the product of our two expressions (1.7) 
and (1.8). We have 


L -n f d's Apv, f(v) 
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We've actually done something sly in this second line which is not really justified. 
We’re assuming that the fluid has an average velocity (vz) = Ux in the x-direction. 
Yet, at the same time we’ve used the Maxwell-Boltzmann distribution for the velocity 
of the particles which has (v,) = 0. Presumably this is not too bad if the speed of the 
flow u < (v), the average speed of the particles in the fluid, but we really should be 
more careful in quantifying this. Nonetheless, the spirit of this section is just to get a 
heuristic feel for the physics, so let’s push on regardless. Writing the velocity integral 
in polar coordinates, we have 


F du 2 ü 27 m 3/2 maa 
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At this stage we can trivially do the f dọ integral and J d cos? 6 sin = 2/3. We’re 
left with 


F mnldu, m 3/2 2 
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But the integral f dv is simply the expression for the average speed (v) in the gas. We 
have our final expression, 


Comparing with (1.6), our expression for the viscosity is 


n= mnll) (1.11) 
There is something surprising about the viscosity: it is independent of the density 
n = N/V of the gas. At first sight that looks like a wrong statement because, obvi- 
ously, there is a factor of n sitting in (1.11). But remember that the mean free path 
depends inversely on the density, l ~ 1/n, as we can see from (1.2). The fact that the 
viscosity does not depend on the fluid density is rather counterintuitive. You might 
think that denser gasses should be more viscous. But the derivation above provides 
the explanation for this behaviour: if you halve the density, there are half as many 
molecules moving down. But each travels twice as far and therefore imparts twice the 
momentum kick Ap when they finally hit. 


The expression (1.11) holds a special place in the history of physics. It was first 
derived by Maxwell and is arguably the first truly novel prediction that was made 
using kinetic theory, providing important evidence for the existence of atoms which, at 
the time, were not universally believed. Indeed, Maxwell himself was surprised by the 
fact that 7 is independent of the density of the gas, writing at the time 


“Such a consequence of the mathematical theory is very startling and the 
only experiment I have met with on the subject does not seem to confirm 
it. 


Maxwell rose to the challenge, building the apparatus and performing the experiment 
that confirmed his own prediction. 


1.3.3 Thermal Conductivity 


The next transport process we will look at is the conduction of heat. Place a fluid 
between two plates, each held at a different temperature. Empirically, one finds a flow 
of energy in the fluid. This is described by the heat flow vector, g, defined by the energy 
per unit time passing through a unit area (which is perpendicular to g). Empirically, 
the flow of heat is proportional to the temperature gradient, 


g=—-KVT (1.12) 


where « is called the thermal conductivity. Once again, we would like to derive both 
this empirical law, as well as an expression for ~. 
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Our calculation follows the same path that we took to determine the viscosity. Let’s 
set up a temperature gradient in the z-direction. The number of particles with velocity 
v that pass through a slab at position z per unit time per unit area is again given 
by (1.7). We’ll use equipartition and assume that the average energy of a particle at 
position z is given by 


p= Š kaT(2) 


We also need to know how particles deposit or gain energy when they reach the slab. If 
a particle came from a hot place with temperature T(z + Az), we’ll assume the particle 
deposits the difference in energy. Similarly, if the particle arrives from a colder place, 
we'll assume it absorbs the difference. This means 
AE = E(z + Az) — E(z) = T 
2 dz 
Recall that the height Az from which the particle arrives depends on both the mean 


free path and the angle at which it comes in: Az = l cos 6. 


As in the derivation of the viscosity, there is something a little dodgy in what we’ve 
written above. We’ve equated the energy deposited or gained by a particle with the 
average energy. But this energy transfer will certainly depend on the velocity of the 
particle and which is dictated by the Maxwell-Boltzmann distribution in (1.7). As in 
the derivation of the viscosity, we will simply ignore this fact and proceed. We’ll do 
better in the next section. 


Modulo the concerns above, we now have enough information to compute the heat 
flow. It is 


lq| = n fae AEv, f(v) 


Doing the integrals f d*v using the same steps that took us from (1.9) to (1.10), we 
derive the law of heat flow (1.12) 


1 dT 
la| = ~akanl{y) 


The thermal conductivity is the proportionality constant. It is usually expressed in 
terms of the specific heat, cy, of the ideal gas 


1 
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1.3.4 Conservation Means Diffusion 


Thermal conductivity is all about the transport of energy; viscosity is about the trans- 
port of momentum. But both energy and momentum have a very special property: 
they are conserved. 


What’s more, because physics is local, we can make a stronger statement than just 
“the total energy doesn’t change”. If the energy in some region of space, E(x), changes 
then it must show up in a neighbouring region of space. But that’s exactly what the 
heat flow q¢ is telling us: how energy is moving from one point to the next. This local 
conservation law is captured by the equation. 

dE 


Once again equating energy with the thermal energy, E(x) = 3kpT (Z), the continuity 


equation reads 
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This is the heat equation. It tells us that any inhomogeneities in temperature are 
smoothed out through diffusion with diffusion constant D = k/cy = 3l(v) ~ l/r. 


There is a similar story for momentum, pê where i = 1, 2,3 labels the three directions 
of space. The continuity equation reads 


dp’ 4 OP# B 
dt Əri 


where P is the pressure tensor which describes the flux of i-momentum in the j- 
direction. 


But looking back at our derivation of the viscosity in Section 1.3.2, this is precisely 
what we equated to the force F/A: the flux of z-momentum in the z-direction. (Ac- 
tually there’s an extra minus sign that follows from our previous definition of F). 
Combining the continuity equation with our earlier expression for the viscosity, we find 

dp” duz duz 


= mn 


dt a Tie 


where, as in Section 1.3.2, we’ve restricted to situations with no velocity gradients in the 
x and y directions. The result is once again a diffusion equation, this time for gradients 
in velocity. And, once again, the diffusion constant given by D = n/mn = Hv) ~ EF fr. 


= [2:= 


We learn that all roads lead to diffusion. For any conserved quantity — whether par- 
ticle number, energy or momentum — any inhomogeneities in the system are smoothed 
away through the diffusion equation. 


The equations that we’ve written down in this final section are rather hand-waving 
and, in cases, missing some interesting physics. The proper equations are those of 
hydrodynamics. The goal of the next section is the do a better job in deriving these. 


—13- 


2. Kinetic Theory 


The purpose of this section is to lay down the foundations of kinetic theory, starting 
from the Hamiltonian description of 10? particles, and ending with the Navier-Stokes 
equation of fluid dynamics. Our main tool in this task will be the Boltzmann equation. 
This will allow us to provide derivations of the transport properties that we sketched in 
the previous section, but without the more egregious inconsistencies that crept into our 
previous attempt. But, perhaps more importantly, the Boltzmann equation will also 
shed light on the deep issue of how irreversibility arises from time-reversible classical 
mechanics. 


2.1 From Liouville to BBGKY 


Our starting point is simply the Hamiltonian dynamics for N identical point particles. 
Of course, as usual in statistical mechanics, here is N ridiculously large: N ~ O(10?°) 
or something similar. We will take the Hamiltonian to be of the form 


N N 
1 e -> ys oes 
Te p? + > VR) + UF -i (2.1) 
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The Hamiltonian contains an external force F = —VV that acts equally on all parti- 


cles. There are also two-body interactions between particles, captured by the potential 
energy U(r; — Tj). At some point in our analysis (around Section 2.2.3) we will need 
to assume that this potential is short-ranged, meaning that U(r) ~ 0 for r >> d where, 
as in the last Section, d is the atomic distance scale. 


Hamilton’s equations are 


Op, OH Or, OH 


(2.2) 
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Our interest in this section will be in the evolution of a probability distribution, 
f(T: Pi; t) over the 6N dimensional phase space. This function tells us the proba- 
bility that the system will be found in the vicinity of the point (7j,9;). As with all 
probabilities, the function is normalized as 


N 
fw ifp20S1 with dV = J [Prid p: 


i=1 


Furthermore, because probability is locally conserved, it must obey a continuity equa- 
tion: any change of probability in one part of phase space must be compensated by 


= A= 


a flow into neighbouring regions. But now we’re thinking in terms of phase space, 
the “V” term in the continuity equation includes both 0/0r; and O/Op; and, corre- 
spondingly, the velocity vector in phase space is (Fi, D;). The continuity equation of the 
probability distribution is then 


ata Eta Ga) =" 


where we're using the convention that we sum over the repeated index i = 1,..., N. 
But, using Hamilton’s equations (2.2), this becomes 
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This final equation is the Liouville’s equation. It is the statement that probability 
doesn’t change as you follow it along any trajectory in phase space, as is seen by 
writing the Liouville equation as a total derivative, 


af Of Of n 
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To get a feel for how probability distributions evolve, one often evokes the closely 


related Liouville’s theorem. 


This is the statement that if you follow some region of 
phase space under Hamiltonian evolution, then its shape can change but its volume 
remains the same. This means that the probability distribution on phase space acts 
like an incompressible fluid. Suppose, for example, that it’s a constant, f, over some 
region of phase space and zero everywhere else. Then the distribution can’t spread out 
over a larger volume, lowering its value. Instead, it must always be f over some region 


of phase space. The shape and position of this region can change, but not its volume. 


The Liouville equation is often written using the Poisson bracket, 


ðA ðB 0A OB 
Or, Op, Op, OF; 


{A, B} = 
With this notation, Liouville’s equation becomes simply 


Of 
g = UE FI 


2A fuller discussion of Hamiltonian mechanics and Liouville’s theorem can be found in the lectures 
on Classical Dynamics. 


—15- 


It’s worth making a few simple comments about these probability distributions. Firstly, 
an equilibrium distribution is one which has no explicit time dependence: 


Of | 

= 
which holds if {H, f} = 0. One way to satisfy this is if f is a function of H and the most 
famous example is the Boltzmann distribution, f ~ e~°%. However, notice that there 
is nothing (so-far!) within the Hamiltonian framework that requires the equilibrium 
distribution to be Boltzmann: any function that Poisson commutes with H will do the 
job. We’ll come back to this point in Section 2.2.2. 


0 


Suppose that we have some function, A(7r;,);), on phase space. The expectation 
value of this function is given by 


(4) = f av AGH) I Ea Pat (2.3) 


This expectation value changes with time only if there is explicit time dependence in 
the distribution. (For example, this means that in equilibrium (A) is constant). We 
have 


d(A) _ of 
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where we have integrated by parts to get to the last line, throwing away boundary 
terms which is justified in this context because f is normalized which ensures that we 
must have f — 0 in asymptotic parts of phase space. Finally, we learn that 


d(A) _ 
a f dV {A, H} f = ({A, H}) (2.5) 


This should be ringing some bells. The Poisson bracket notation makes these expres- 
sions for classical expectation values look very similar to quantum expectation values. 
2.1.1 The BBGKY Hierarchy 


Although we’re admitting some ignorance in our description of the system by consider- 
ing a probability distribution over N-particle phase space, this hasn’t really made our 
life any easier: we still have a function of ~ 107° variables. To proceed, the plan is 


— 16—- 


to limit our ambition. We’ll focus not on the probability distribution for all N parti- 
cles but instead on the one-particle distribution function. This captures the expected 
number of partcles lying at some point (7, p). It is defined by 


N 
fi (7, p; t) = N | [Jers F(T, T2, ia, TN, D, D2,- ONG) 
i=2 


Although we seem to have singled out the first particle for special treatment in the 
above expression, this isn’t really the case since all N of our particles are identical. 
This is also reflected in the factor M which sits out front which ensures that fı is 
normalized as 


fere fir, ot) =N (2.6) 


For many purposes, the function fı is all we really need to know about a system. In 
particular, it captures many of the properties that we met in the previous chapter. For 
example, the average density of particles in real space is simply 


n(rit) = f ap AF) (2.7) 
The average velocity of particles is 
agit) = f PERERA (2.8) 
and the energy flux is 
ft) = | Pp Lemna) (2.9) 


where we usually take E(p) = p?/2m. All of these quantities (or at least close relations) 
will be discussed in some detail in Section 2.4. 


Ideally we'd like to derive an equation governing fı. To see how it changes with time, 
we can simply calculate: 


hw [Tenn -N [Tera ostan f} 


Using the Hamiltonian given in (2.1), this becomes 


N 


MES 


Now, whenever j = 2,... N, we can always integrate by parts to move the derivatives 
away from f and onto the other terms. And, in each case, the result is simply zero 
because when the derivative is with respect to rj, the other terms depend only on p; 
and vice-versa. We’re left only with the terms that involve derivatives with respect 
to r; and pı because we can’t integrate these by parts. Let’s revert to our previous 
notation and call 7; = F and p} = p. We have 


3 = N > 3 
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where we have defined the one-particle Hamiltonian 
p 

H; = — r 2.11 
=F VE) (2.11) 


Notice that Hı includes the external force V acting on the particle, but it knows nothing 
about the interaction with the other particles. All of that information is included in 
the last term with U(7r—7,). We see that the evolution of the one-particle distribution 
function is described by a Liouville-like equation, together with an extra term. We 
write 


Oft ofi 


The first term is sometimes referred to as the streaming term. It tells you how the 
particles move in the absence of collisions. The second term, known as the collision 
integral, is given by the second term in (2.10). In fact, because all particles are the 
same, each of the (N — 1) terms in pail in (2.10) are identical and we can write 


ð a 
(E) -Nw -n f drd a fileren F, Fa., D, Bo,- .3t) 


But now we’ve got something of a problem. The collision integral can’t be expressed 
in terms of the one-particle distribution function. And that’s not really surprising. As 
the name suggests, the collision integral captures the interactions — or collisions — of 


one particle with another. Yet fı contains no information about where any of the other 
particles are in relation to the first. However some of that information is contained in 
the two-particle distribution function, 


N 
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With this definition, the collision integral is written simply as 


On) f a UF-R) ðk 
(4). = fa rod P2 ƏF op (2.13) 


The collision term doesn’t change the distribution of particles in space. This is captured 
by the particle density (2.7) which we get by simply integrating n = f d°pf,. But, after 
integrating over f d?p, we can perform an integrating by parts in the collision integral 
to see that it vanishes. In contrast, if we’re interested in the distribution of velocities — 
such as the current (2.8) or energy flux (2.9) — then the collision integral is important. 


The upshot of all of this is that if we want to know how the one-particle distribution 
function evolves, we also need to know something about the two-particle distribution 
function. But we can always figure out how fə evolves by repeating the same calculation 
that we did above for fı. It’s not hard to show that fọ evolves by a Liouville-like 
equation, but with a corrected term that depends on the three-particle distribution 
function f3. And fs evolves in a Liouville manner, but with a correction term that 
depends on f4, and so on. In general, the n-particle distribution function 


, 2s. es N! “ a 
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where the effective n-body Hamiltonian includes the external force and any interactions 
between the n particles but neglects interactions with any particles outside of this set, 


n >2 
Pi ~ 2 S 
Ha= > (Z +ve)+ > UF) 
i=l 


i<jcn 


The equations (2.14) are known as the BBGKY hierarchy. (The initials stand for 
Bogoliubov, Born, Green, Kirkwood and Yvon). They are telling us that any group 
of n particles evolves in a Hamiltonian fashion, corrected by interactions with one of 
the particles outside that group. At first glance, it means that there’s no free lunch; 
if we want to understand everything in detail, then we’re going to have to calculate 
everything. We started with the Liouville equation governing a complicated function 
f of N ~ O(1073) variables and it looks like all we’ve done is replace it with O(10?°) 
coupled equations. 
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However, there is an advantage is working with the hierarchy of equations (2.14) 
because they isolate the interesting, simple variables, namely fı and other lower fn. 
This means that the equations are in a form that is ripe to start implementing various 
approximations. Given a particular problem, we can decide which terms are important 
and, ideally, which terms are so small that they can be ignored, truncating the hierarchy 
to something manageable. Exactly how you do this depends on the problem at hand. 
Here we explain the simplest, and most useful, of these truncations: the Boltzmann 
equation. 


2.2 The Boltzmann Equation 


“Elegance is for tailors” 
Ludwig Boltzmann 


In this section, we explain how to write down a closed equation for fı alone. This 
will be the famous Boltzmann equation. The main idea that we will use is that there 
are two time scales in the problem. One is the time between collisions, 7, known as 
the scattering time or relaxation time. The second is the collision time, Teon, which 
is roughly the time it takes for the process of collision between particles to occur. In 
situations where 


T > Teoll (2.15) 


we should expect that, for much of the time, fı simply follows its Hamiltonian evolution 
with occasional perturbations by the collisions. This, for example, is what happens for 
the dilute gas. And this is the regime we will work in from now on. 


At this stage, there is a right way and a less-right way to proceed. The right way is 
to derive the Boltzmann equation starting from the BBGKY hierarchy. And we will do 
this in Section 2.2.3. However, as we shall see, it’s a little fiddly. So instead we'll start 
by taking the less-right option which has the advantage of getting the same answer 
but in a much easier fashion. This option is to simply guess what form the Boltzmann 
equation has to take. 


2.2.1 Motivating the Boltzmann Equation 
We’ve already caught our first glimpse of the Boltzmann equation in (2.12), 


Ofi _ Of 
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But, of course, we don’t yet have an expression for the collision integral in terms of 
fı- It’s clear from the definition (2.13) that the second term represents the change in 
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momenta due to two-particle scattering. When T > Teon, the collisions occur occasion- 
ally, but abruptly. The collision integral should reflect the rate at which these collisions 
occur. 


Suppose that our particle sits at (7,p) in phase space and collides with another 
particle at (7, p2). Note that we’re assuming here that collisions are local in space so 
that the two particles sit at the same point. These particles can collide and emerge 
with momenta p; and p}. We’ll define the rate for this process to occur to be 


Rate = w(p, P2|P1, Po) fol, T, D, Pa) pod? p d° pi, (2.17) 


(Here we’ve dropped the explicit t dependence of fo only to keep the notation down). 
The scattering function w contains the information about the dynamics of the process. 
It looks as if this is a new quantity which we’ve introduced into the game. But, using 
standard classical mechanics techniques, one can compute w for a given inter-atomic 
potential U(r). (It is related to the differential cross-section; we will explain how to 
do this when we do things better in Section 2.2.3). For now, note that the rate is 
proportional to the two-body distribution function f> since this tells us the chance that 
two particles originally sit in (r, p) and (r, p2). 


We'd like to focus on the distribution of particles with some specified momentum 
p. Two particles with momenta p and pọ can be transformed in two particles with 
momenta p; and p}. Since both momenta and energy are conserved in the collision, we 
have 


P+ p =p, +p, (2.18) 
P +t =p +p? (2.19) 


There is actually an assumption that is hiding in these equations. In general, we’re 
considering particles in an external potential V. This provides a force on the particles 
which, in principle, could mean that the momentum and kinetic energy of the particles 
is not the same before and after the collision. To eliminate this possibility, we will 
assume that the potential only varies appreciably over macroscopic distance scales, so 
that it can be neglected on the scale of atomic collisions. This, of course, is entirely 
reasonable for most external potentials such as gravity or electric fields. Then (2.18) 
and (2.19) continue to hold. 


While collisions can deflect particles out of a state with momentum p and into a 
different momentum, they can also deflect particles into a state with momentum p. 


=e 


This suggests that the collision integral should contain two terms, 
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The first term captures scattering into the state p, the second scattering out of the 
state p. 


The scattering function obeys a few simple requirements. Firstly, it is only non- 
vanishing for scattering events that obey the conservation of momentum (2.18) and 
energy (2.19). Moreover, the discrete symmetries of spacetime also give us some im- 
portant information. Under time reversal, p — —p and, of course, what was coming in 
is now going out. This means that any scattering which is invariant under time reversal 
(which is more or less anything of interest) must obey 


w(p, Pol Pi, P>) z w(—P, — pa] ~ P, —Pp2) 


Furthermore, under parity (r, p) + (—7, —p). So any scattering process which is parity 
invariant further obeys 


w(D, P2\P1, Po) E w(—p, —pə| = Pi, — pz) 


The combination of these two means that the scattering rate is invariant under exchange 
of ingoing and outgoing momenta, 


w(P, Popi, P2) = W(P, P2|P, P2) (2.20) 


(There is actually a further assumption of translational invariance here, since the scat- 


tering rate at position —7 should be equivalent to the scattering rate at position +r) 


The symmetry property (2.20) allows us to simplify the collision integral to 
Of fe anes Ss ee er 5 
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To finish the derivation, we need to face up to our main goal of expressing the collision 
integral in terms of fı rather than f. We make the assumption that the velocities of 
two particles are uncorrelated, so that we can write 


f(r, T, P, P2) = Air, P) filr, po) (2.22) 


This assumption, which sometimes goes by the name of molecular chaos, seems innocu- 
ous enough. But actually it is far from innocent! To see why, let’s look more closely 


SI 


at what we’ve actually assumed. Looking at (2.21), we can see that we have taken the 
rate of collisions to be proportional to f2(7, 7, pı, p2) where pı and pz are the momenta 
of the particles before the collision. That means that if we substitute (2.22) into (2.21), 
we are really assuming that the velocities are uncorrelated before the collision. And 
that sounds quite reasonable: you could imagine that during the collision process, the 
velocities between two particles become correlated. But there is then a long time, 7, 
before one of these particles undergoes another collision. Moreover, this next collision 
is typically with a completely different particle and it seems entirely plausible that the 
velocity of this new particle has nothing to do with the velocity of the first. Nonethe- 
less, the fact that we’ve assumed that velocities are uncorrelated before the collision 
rather than after has, rather slyly, introduced an arrow of time into the game. And 
this has dramatic implications which we will see in Section 2.3 where we derive the 
H-theorem. 


Finally, we may write down a closed expression for the evolution of the one-particle 
distribution function given by 


Of, Of 
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with the collision integral 
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This is the Boltzmann equation. It’s not an easy equation to solve! It’s a differential 
equation on the left, an integral on the right, and non-linear. You may not be surprised 
to hear that exact solutions are not that easy to come by. We’ll see what we can do. 


2.2.2 Equilibrium and Detailed Balance 


Let’s start our exploration of the Boltzmann equation by revisiting the question of the 
equilibrium distribution obeying Of°4/Ot = 0. We already know that {f, Hı} = 0 if f 
is given by any function of the energy or, indeed any function that Poisson commutes 
with H. For clarity, let’s restrict to the case with vanishing external force, so V(r) = 0. 
Then, if we look at the Liouville equation alone, any function of momentum is an 
equilibrium distribution. But what about the contribution from the collision integral? 


One obvious way to make the collision integral vanish is to find a distribution which 
obeys the detailed balance condition, 
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In fact, it’s more useful to write this as 


log( fi (r, Bi) + log( fi (7, Pa) = log( Fi (r, p)) + log fi (F, p2)) (2.26) 


How can we ensure that this is true for all momenta? The momenta on the right are 
those before the collision; on the left they are those after the collision. From the form of 
(2.26), it’s clear that the sum of log f;* must be the same before and after the collision: 
in other words, this sum must be conserved during the collision. But we know what 
things are conserved during collisions: momentum and energy as shown in (2.18) and 
(2.19) respectively. This means that we should take 


log( fi (F, P)) = 8 (u — E(P) + ë- p’) (2.27) 


where E(p) = p?/2m for non-relativistic particles and u, 6 and @ are all constants. 
We’ll adjust the constant u to ensure that the overall normalization of fı obeys (2.6). 
Then, writing p = mv, we have 


3/2 
which reproduces the Maxwell-Boltzmann distribution if we identify 6 with the inverse 
temperature. Here wu allows for the possibility of an overall drift velocity. We learn 
that the addition of the collision term to the Liouville equation forces us to sit in the 
Boltzmann distribution at equilibrium. 


There is a comment to make here that will play an important role in Section 2.4. 
If we forget about the streaming term {Hj, fı} then there is a much larger class of 
solutions to the requirement of detailed balance (2.25). These solutions are again of 
the form (2.27), but now with the constants u, 2 and w@ promoted to functions of space 
and time. In other words, we can have 
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Such a distribution is not quite an equilibrium distribution, for while the collision 
integral in (2.23) vanishes, the streaming term does not. Nonetheless, distributions 
of this kind will prove to be important in what follows. They are said to be in local 
equilibrium, with the particle density, temperature and drift velocity varying over space. 


The Quantum Boltzmann Equation 


Our discussion above was entirely for classical particles and this will continue to be 
our focus for the remainder of this section. However, as a small aside let’s look at how 
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things change for quantum particles. We’ll keep the assumption of molecular chaos, 
so fo ~ fifi as in (2.22). The main difference occurs in the scattering rate (2.17) for 
scattering pı + P2 > Pi + ps which now becomes 


Rate = w(p, Po|P1, Po) fil) fi (pa) { Lae FDY {1 T FB} dp2d°p',d°p, 


The extra terms are in curly brackets. We pick the + sign for bosons and the — sign 
for fermions. The interpretation is particularly clear for fermions, where the number of 
particles in a given state can’t exceed one. Now it’s not enough to know the probability 
that initial state is filled. We also need to know that probability that the final state is 
free for the particle to scatter into: and that’s what the {1 — fı} factors are telling us. 


The remaining arguments go forward as before, resulting in the quantum Boltzmann 
equation 


(3) F J podpi dp, w(i PBP) [AOAN E AHL E fie} 
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To make contact with what we know, we can look again at the requirement for equi- 
librium. The condition of detailed balance now becomes 


a A Z ( ea (p) + e(a) ) 
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Which is again solved by relating each log to a linear combination of the energy and 


momentum. We find 


ik 
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which reproduces the Bose-Einstein and Fermi-Dirac distributions. 


2.2.3 A Better Derivation 


In Section (2.2.1), we derived an expression for the collision integral (2.24) using in- 
tuition for the scattering processes at play. But, of course, we have a mathematical 
expression for the collision integral in (2.13) involving the two-particle distribution 
function fo. In this section we will sketch how one can derive (2.24) from (2.13). This 
will help clarify some of the approximations that we need to use. At the same time, 
we will also review some basic classical mechanics that connects the scattering rate w 
to the inter-particle potential U(r). 
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We start by returning to the BBGKY hierarchy of equations. For simplicity, we’ll 
turn off the external potential V(r) = 0. We don’t lose very much in doing this because 
most of the interesting physics is concerned with the scattering of atoms off each other. 
The first two equations in the hierarchy are 


ð P O 3 3, OU(T1—T2) oh 
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In both of these equations, we’ve gathered the streaming terms on the left, leaving 

only the higher distribution function on the right. To keep things clean, we’ve sup- 

pressed the arguments of the distribution functions: they are fı = fı(rı, pı; t) and 

fo = fo(T1, T2, Pi, D2; t) and you can guess the arguments for fz. 


Our goal is to better understand the collision integral on the right-hand-side of (2.30). 
It seems reasonable to assume that when particles are far-separated, their distribution 
functions are uncorrelated. Here, “far separated” means that the distance between 
them is much farther than the atomic distance scale d over which the potential U(r) 
extends. We expect 


Fali, 72, Pi, Dot) > fal’, Pa t) fi (72, po3t) when |i- r| > d 


But, a glance at the right-hand-side of (2.30) tells us that this isn’t the regime of 
interest. Instead, f is integrated OU(r) /Or which varies significantly only over a region 
r <d. This means that we need to understand fọ when two particles get close to each 
other. 


We'll start by getting a feel for the order of magnitude of various terms in the 
hierarchy of equations. Dimensionally, each term in brackets in (2.30) and (2.31) is an 
inverse time scale. The terms involving the inter-atomic potential U(r) are associated 
to the collision time Teol- 


1 OU O 
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This is the time taken for a particle to cross the distance over which the potential U(r) 
varies which, for short range potentials, is comparable to the atomic distance scale, d, 
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itself and 
d 
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where Urei is the average relative speed between atoms. Our first approximation will be 
that this is the shortest time scale in the problem. This means that the terms involving 
OU /Or are typically the largest terms in the equations above and determine how fast 
the distribution functions change. 


With this in mind, we note that the equation for fı is special because it is the only 
one which does not include any collision terms on the left of the equation (i.e. in 
the Hamiltonian H,,). This means that the collision integral on the right-hand side 
of (2.30) will usually dominate the rate of change of fı. (Note, however, we’ll meet 
some important exceptions to this statement in Section 2.4). In contrast, the equation 
that governs fə has collision terms on the both the left and the right-hand sides. But, 
importantly, for dilute gases, the term on the right is much smaller than the term on 
the left. To see why this is, we need to compare the f3 term to the fọ term. If we were 
to integrate f3 over all space, we get 


[andr f =Nfo 


(where we’ve replaced (N — 2) ~ N in the above expression). However, the right- 
hand side of (2.31) is not integrated over all of space. Instead, it picks up a non-zero 
contribution over an atomic scale ~ d?. This means that the collision term on the 
right-hand-side of (2.31) is suppressed compared to the one on the left by a factor of 
Nd?/V where V is the volume of space. For gases that we live and breath every day, 
Nd?/V ~ 10-3 — 1074. We make use of this small number to truncate the hierarchy of 
equations and replace (2.31) with 
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This tells us that fə typically varies on a time scale of Teon and a length scale of d. 
Meanwhile, the variations of fı is governed by the right-hand-side of (2.30) which, by 
the same arguments that we just made, are smaller than the variations of fọ by a factor 
of Nd3/V. In other words, fı varies on the larger time scale 7. 


In fact, we can be a little more careful when we say that fo varies on a time scale 
Teo. We see that — as we would expect — only the relative position is affected by the 
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collision term. For this reason, it’s useful to change coordinate to the centre of mass 
and the relative positions of the two particles. We write 


3 1 z = = = 
R= rtr) > T=Mm—T72 
and similar for the momentum 
3 = = _ 1 _ _ 
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And we can think of fo = fa(R, r, P,P t). The distribution function will depend on the 
centre of mass variables R and P in some slow fashion, much as fı depends on position 
and momentum. In contrast, the dependence of fə on the relative coordinates 7 and 
p is much faster — these vary over the short distance scale and can change on a time 
scale of order Teol- 


Since the relative distributions in fọ vary much more quickly that fı, we’ll assume 
that fə reaches equilibrium and then feeds into the dynamics of fı. This means that, 
ignoring the slow variations in R and P, we will assume that Of» /Ot = 0 and replace 
(2.32) with the equilibrium condition 

(Z a = out) z) fo © 0 (2.33) 
m OF Or Op 
This is now in a form that allows us to start manipulating the collision integral on the 


right-hand-side of (2.30). We have 
(+) _ [erat OU (Tri — r2) Ofe 
coll 


Ot Or, op 
, U(r) [a a 
= 3. 43 : 
a fa r2d°p2 OF E | fo 
_i 3. 3 > > Oh 
E7 ee rod’ po (Di P2) ar (2.34) 


where in the second line the extra term 0/Op> vanishes if we integrate by parts and, 
in the third line, we’ve used our equilibrium condition (2.33), with the limits on the 
integral in place to remind us that only the region r < d contributes to the collision 
integral. 


A Review of Scattering Cross Sections 


To complete the story, we still need to turn (2.34) into the collision integral (2.24). 
But most of the work simply involves clarifying how the scattering rate w(p, polpi, D4) 
is defined for a given inter-atomic potential U(r — r2). And, for this, we need to review 
the concept of the differential cross section. 
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Figure 4: The differential cross section. 


Let’s think about the collision between two particles. They start with momenta 
pi; = mv; and end with momenta p; = mv/ with i = 1,2. Now let’s pick a favourite, 
say particle 1. We’ll sit in its rest frame and consider an onslaught of bombarding 
particles, each with velocity v — v1. This beam of incoming particles do not all hit our 
favourite boy at the same point. Instead, they come in randomly distributed over the 
plane perpendicular to v2 — vı. The flux, J, of these incoming particles is the number 
hitting this plane per area per second, 


N 
I = —(|e. — Ñ. 
y” v| 


Now spend some time staring at Figure 4. There are a number of quantities defined 
in this picture. First, the impact parameter, b, is the distance from the asymptotic 
trajectory to the dotted, centre line. We will use b and ¢ as polar coordinates to 
parameterize the plane perpendicular to the incoming particle. Next, the scattering 
angle, 0, is the angle by which the incoming particle is deflected. Finally, there are two 
solid angles, do and dQ, depicted in the figure. Geometrically, we see that they are 
given by 


do =bdbd@ and dQ = sinddéd@ 


The number of particles scattered into dQ in unit time is Jda. We usually write this as 


do 
2 = 2. 
Igt Ibdbdo (2.35) 
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Figure 5: On the left: a point particle scattering off a hard sphere. On the right: a hard 
sphere scattering off a hard sphere. 


where the differential cross section is defined as 
do b |\db} 1 
dQ dð) 2 


You should think of this in the following way: for a fixed (> — 1), there is a unique 


d(b?) 


2? 
dcos@ 238) 


sin 0 


relationship between the impact parameter b and the scattering angle 0 and, for a given 
potential U(r), you need to figure this out to get |da/dQ| as a function of 0. 


Now we can compare this to the notation that we used earlier in (2.17). There we 
talked about the rate of scattering into a small area dèpi d?p, in momentum space. But 
this is the same thing as the differential cross-section. 


> >. yp >l ‘ = — do 
w(P, Po; Pi Pa) dp dp, = |v — a] Fe dQ (2.37) 


(Note, if you're worried about the fact that d*p/d°p is a six-dimensional area while 
dQ is a two dimensional area, recall that conservation of energy and momenta provide 
four restrictions on the ability of particles to scatter. These are implicit on the left, 
but explicit on the right). 


An Example: Hard Spheres 


In Section 1.2, we modelled atoms as hard spheres of diameter d. It’s instructive to 
figure out the cross-section for such a hard sphere. 


In fact, there are two different calculations that we can do. First, suppose that we 
throw point-like particles at a sphere of diameter d with an impact parameter b < d/2 
From the left-hand diagram in Figure 5, we see that the scattering angle is 6 = 7 — 2a, 
where 
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or 


2 2 
b= £ cos? $ = (1 + cos 6) 


From (2.36), we then find the differential cross-section 


d2 
16 


do 
dQ 


The total cross-section is defined as 


T do d\? 
or = on | a sino 5% = (5) 


This provides a nice justification for the name because this is indeed the cross-sectional 
area of a sphere of radius d/2. 


Alternatively, we could consider two identical hard spheres, each of diameter d, one 
scattering off the other. Now the geometry changes a little, as shown in the right-hand 
diagram in Figure 5. The impact parameter is now the distance between the centres of 
the spheres, and given by 


d 
b=2 x -sing 
2 
Clearly we now need b < d. The same calculation as above now gives 
op = rd 


This is the same effective cross-sectional area that we previously used back in Section 
1.2 when discussing basic aspects of collisions. 


Almost Done 


With this refresher course on classical scattering, we can return to the collision integral 
(2.34) in the Boltzmann equation. 


Of, _ 3 3 = = Ofe 
( Ot ) 7 faaea’ me ie m K or 


We’ll work in cylindrical polar coordinates shown in Figure 6. The direction parallel 
to Uy — vı is parameterized by x; the plane perpendicular is parameterised by ¢ and 


=i] = 


Figure 6: Two particle scattering 


the impact parameter b. We’ve also shown the collision zone in this figure. Using the 
definitions (2.35) and (2.37), we have 


OF 22 Tee aes = Of, 
(F) = [ae "m a f aoa f Ox 
= | boatety, w(Pi, PalP, D2) [fo(x2) = fo(x1)] 


It remains only to decide what form the two-particle distribution function f> takes just 
before the collision at x = x, and just after the collision at x = x2. At this point we 
invoke the assumption of molecular chaos. Just before we enter the collision, we assume 
that the two particles are uncorrelated. Moreover, we assume that the two particles 
are once again uncorrelated by the time they leave the collision, albeit now with their 
new momenta 


falai) = fi P t) f(r, post) and falza) = fil? i; t) fil? Do; t) 


Notice that all functions fı are evaluated at the same point 7 in space since we’ve 
assumed that the single particle distribution function is suitably coarse grained that it 
doesn’t vary on scales of order d. With this final assumption, we get what we wanted: 
the collision integral is given by 


o / ri i Ae i ee: z EA 
(%5) = [pad na'y, w(D1, Palp, P2) [AE BY) fal, p) fran me 
coll 


in agreement with (2.24). 
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2.3 The H-Theorem 


The topics of thermodynamics and statistical mechanics are all to do with the equi- 
librium properties of systems. One of the key intuitive ideas that underpins their 
importance is that if you wait long enough, any system will eventually settle down to 
equilibrium. But how do we know this? Moreover, it seems that it would be rather 
tricky to prove: settling down to equilibrium clearly involves an arrow of time that dis- 
tinguishes the future from the past. Yet the underlying classical mechanics is invariant 
under time reversal. 


The purpose of this section is to demonstrate that, within the framework of the 
Boltzmann equation, systems do indeed settle down to equilibrium. As we described 
above, we have introduced an arrow of time into the Boltzmann equation. We didn’t 
do this in any crude way like adding friction to the system. Instead, we merely assumed 
that particle velocities were uncorrelated before collisions. That would seem to be a 
rather minor input but, as we will now show, it’s enough to demonstrate the approach 
to equilibrium. 


Specifically, we will prove the “H-theorem”, named after a quantity H introduced 
by Boltzmann. (H is not to be confused with the Hamiltonian. Boltzmann originally 
called this quantity something like a German €, but the letter was somehow lost in 
translation and the name H stuck). This quantity is 


H(t) = f Prap APD t) oged) 


This kind of expression is familiar from our first Statistical Mechanics course where we 
saw that the entropy S for a probability distribution p is S = —kgplogp. In other 
words, this quantity H is simply 


S = —kgH 


The H-theorem, first proven by Boltzmann in 1872, is the statement that H always 
decreases with time. The entropy always increases. We will now prove this. 


As in the derivation (2.4), when you’re looking at the variation of expectation values 
you only care about the explicit time dependence, meaning 


ð : ð 
= | rap (log fı + ye = fere log fı oh 


dH 
dt 
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where we can drop the +1 because f fı = N is unchanging, ensuring that f Of,/Ot = 0. 
Using the Boltzmann equation (2.23), we have 


dH OV ah P Of (Əh 
dt Or Op m OF Ot) con 


rdp log fi ( 


But the first two terms in this expression both vanish. You can see this by integrating 
by parts twice, first moving the derivative away from fı and onto log fı, and then 
moving it back. We learn that the change in H is governed entirely by the collision 
terms 
dH 
dt 


Of 
= 343 eae 
7 ome ne I ( Ot J 


= | Erëm pad’ yay, w(Pi, Po|P1, P2) log fi(p1) 
x | PDP) — AEA) (2.38) 


where I’ve suppressed r and t arguments of fı to keep things looking vaguely reasonable 
I’ve also relabelled the integration variable p — pı. At this stage, all momenta are 
integrated over so they are really nothing but dummy variables. Let’s relabel 1 + 2 on 
the momenta. All the terms remain unchanged except the log. So we can also write 


dH yak ct = 
a f d'rd’pıd” pad" pdp, (Pi, P2|Pr, Pa) log fi (D2) 
x [AEDA ED — AB) Fula) (2.39) 


Adding (2.38) and (2.39), we have the more symmetric looking expression 


dH 1 1 1 fo | = = 
= 5 | rb pt pa py, w(Di» P3|P1, P2) log [fi (1) fi) 


d 2 
x [AEAEE -APAE (2.40) 


Since all momenta are integrated over, we’re allowed to just flip the dummy indices 
again. This time we swap p + p” in the above expression. But, using the symmetry 
property (2.20), the scattering function remains unchanged?. We get 

dH 1 3 3 3 OF qo} > >lj> > >/ >I 

Hy | Ee Pd pod pid pz w(Pi, Palpi, Pe) log [fi (0i) fipa) 


3An aside: it’s not actually necessary to assume (2.20) to make this step. We can get away with 
the weaker result 


J Ehh w(Pi, Po|Pi,P2) = J Eoin w(P1, B2lPi, B2) 


which follows from unitarity of the scattering matrix. 
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x [AEDA ED -APAP (2.41) 


Finally, we add (2.40) and (2.41) to get 


dH il Pre 


| log [APD AP) — log B AE APAP) -ADAP 242) 


The bottom line of this expression is a function (log x — log y)(a — y). It is positive for 
all values of x and y. Since the scattering rate is also positive, we have the proof of the 
H-theorem. 


a <0 <= ae > 0 

dt 7 dt 7 

And there we see the arrow of time seemingly emerging from time-invariant Hamiltonian 
mechanics! Clearly, this should be impossible, a point first made by Loschmidt soon 
after Boltzmann’s original derivation. But, as we saw earlier, everything hinges on the 
assumption of molecular chaos (2.22). This was where we broke time-reversal symmetry, 
ultimately ensuring that entropy increases only in the future. Had we instead decided 
in (2.21) that the rate of scattering was proportional to fọ after the collision, again 
assuming fo ~ fıfı then we would find that entropy always decreases as we move into 
the future. 


There is much discussion in the literature about the importance of the H-theorem and 
its relationship to the second law of thermodynamics. Notably, it is not particularly 
hard to construct states which violate the H-theorem by virtue of their failure to obey 
the assumption of molecular chaos. Nonetheless, these states still obey a suitable second 
law of thermodynamics’. 


The H-theorem is not a strict inequality. For some distributions, the entropy remains 
unchanged. From (2.42), we see that these obey 


FD fis) — frlar) fr (2) 


But this is simply the requirement of detailed balance (2.25). And, as we have seen al- 
ready, this is obeyed by any distribution satisfying the requirement of local equilibrium 
(2.29). 


4This was first pointed out by E. T. Jaynes in the paper “Violation of Boltzmann’s H Theorem in 
Real Gases” , published in Physical Review A, volume 4, number 2 (1971). 
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2.4 A First Look at Hydrodynamics 


Hydrodynamics is what you get if you take thermodynamics and splash it. You know 
from your first course on Statistical Mechanics that, at the most coarse grained level, 
the equilibrium properties of any system are governed by the thermodynamics. In the 
same manner, low energy, long wavelength, excitations of any system are described by 
hydrodynamics. 


More precisely, hydrodynamics describes the dynamics of systems that are in local 
equilibrium, with parameters that vary slowly in space in time. As we will see, this 
means that the relevant dynamical variables are, in the simplest cases, 


e Density p(7,t) = mn(r,t) 
e Temperature T(7’, t) 
e Velocity (r,t) 


Our goal in this section is to understand why these are the relevant variables to describe 
the system and to derive the equations that govern their dynamics. 


2.4.1 Conserved Quantities 


We'll start by answering the first question: why are these the variables of interest? The 
answer is that these are quantities which don’t relax back down to their equilibrium 
value in an atomic blink of an eye, but instead change on a much slower, domestic time 
scale. At heart, the reason for they have this property is that they are all associated 
to conserved quantities. Let’s see why. 


Consider a general function A(7, p) over the single particle phase space. Because we 
live in real space instead of momentum space, the question of how things vary with 
r is more immediately interesting. For this reason, we integrate over momentum and 
define the average of a quantity A(7r,p) to be 


_ [bp AP DAEB 
J Bp fil", D; t) 


However, we’ve already got a name for the denominator in this expression: it is the 


number density of particles 


nfr,t) = I dofet) (2.43) 
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(As a check of the consistency of our notation, if you plug the local equilibrium dis- 
tribution (2.29) into this expression, then the n(7,t) on the left-hand-side equals the 
n(7,t) defined in (2.29)). So the average is 


1 
n(7,t) 


It’s worth making a couple of simple remarks. Firstly, this is different from the average 


(ACF, t)) = 


f Bp AF DAF Bt) (2.44) 


that we defined earlier in (2.3) when discussion Liouville evolution. Here we’re inte- 
grating only over momenta and the resulting average is a function of space. A related 
point is that we’re at liberty to take functions which depend only on 7 (and not on p) 
in and out of the (-) brackets. So, for example, (nA) = n(A). 


We’re interested in the how the average of A changes with time. We looked at this 
kind of question for Liouville evolution earlier in this section and found the answer 
(2.5). Now we want to ask the same question for the Boltzmann equation. Before we 
actually write down the answer, you can guess what it will look like: there will be a 
streaming term and a term due to the collision integral. Moreover, we know from our 
previous discussion that the term involving the collision integral will vary much faster 
than the streaming term. 


Since we’re ultimately interested in quantities which vary slowly, this motivates look- 
ing at functions A which vanish when integrated against the collision integral. We will 
see shortly that the relevant criterion is 


a ofi _ 
Jè AG) (ar)... = 


We’d like to find quantities A which have this property for any distribution fı. Using 
our expression for the collision integral (2.23), we want 


[bout pat pp BIB. Be) APA) [AUT PAE BS) - PPAF P] = 0 
This now looks rather similar to equation (2.38), just with the log f replaced by A. In- 


deed, we can follow the steps between (2.38) and (2.41), using the symmetry properties 
of w, to massage this into the form 


[borat pay, w(P;, P>|P1, P2) EAA — fil(p1) fi(p2) 


x | AC, Bi) + AQ Be) — ACF, Bt) — AQF, B)] = 0 
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Now it’s clear that if we want this to vanish for all distributions, then A itself must 
have the property that it remains unchanged before and after the collision, 


A(T, pi) + A(T, p2) = A(T, Pi) + A(T, pa) (2.45) 


Quantities which obey this are sometimes called collisional invariants. Of course, in 
the simplest situation we already know what they are: momentum (2.18) and energy 
(2.19) and, not forgetting, the trivial solution A = 1. We’ll turn to each of these in 
turn shortly. But first let’s derive an expression for the time evolution of any quantity 
obeying (2.45). 


Take the Boltzmann equation (2.23), multiply by a collisional invariant A(r, p) and 
integrate over f dp. Because the collision term vanishes, we have 


Ss ð P @ p 0 pye 
[earn (gE Br 2) AERD =o 
where the external force is F = —VV. We'll integrate the last term by parts (remem- 
bering that the force F can depend on position but not on momentum). We can’t 
integrate the middle term by parts since we’re not integrating over space, but nonethe- 
less, we'll also rewrite it. Finally, since A has no explicit time dependence, we can take 
it inside the time derivative. We have 


a fis ð fa P , P OA 5 2 OA, 
5 f te ates ftp Zaz fae 2.2 s— fap Fp =o 


Although this doesn’t really look like an improvement, the advantage of writing it in 
this way is apparent when we remember our expression for the average (2.44). Using 
this notation, we can write the evolution of A as 


o OA = OA 


ð 
LpA a Aen ea Tasi 
ae y+ (nvA) — n(@ ae! nF Op 


= \=0 (2.46) 


where @ = p/m. This is our master equation that tells us how any collisional invariant 
changes. The next step is to look at specific quantities. There are three and we’ll take 
each in turn 


Density 


Our first collisional invariant is the trivial one: A = 1. If we plug this into (2.46) we 
get the equation for the particle density n(r, t), 


L. (nit) (2.47) 
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where the average velocity ț of the particles is defined by 
u(r, t) = (0) 


Notice that, once again, our notation is consistent with earlier definitions: if we pick 
the local equilibrium distribution (2.29), the (r,t) in (2.29) agrees with that defined 
above. The result (2.47) is the continuity equation, expressing the conservation of 
particle number. Notice, however, that this is not a closed expression for the particle 
density n: we need to know the velocity wu as well. 


It’s useful to give a couple of extra, trivial, definitions at this stage. First, although 
we won't use this notation, the continuity equation is sometimes written in terms of 


=> 


the current, J(r,t) = n(r,t)u(7,t). In what follows, we will often replace the particle 
density with the mass density, 


p(T, t) = mn(r,t) 


Momentum 


Our next collisional invariant is the momentum. We substitute A = mv into (2.46) to 
find 


ð ð 
g mnui) + o — (nF;) = 0 (2.48) 


We can play around with the middle term a little. We write 


(vjvi) = (vz — uj) (vi — u)) + ui(vj) + uj (vi) — iiuj 


= ((v; — uj) (vi — ui)) + uiu 
We define a new object known as the pressure tensor, 
Pig = Pj = p((vj — uj) (vi — ui) 


This tensor is computing the flux of 7-momentum in the j-direction. It’s worth pausing 
to see why this is related to pressure. Clearly, the exact form of P;; depends on the 
distribution of particles. But, we can evaluate the pressure tensor on the equilibrium, 
Maxwell-Boltzmann distribution (2.28). The calculation boils down to the same one 
we did in our first Statistical Physics course to compute equipartition: you find 
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which, by the ideal gas law, is proportional to the pressure of the gas. Using this 
definition — together with the continuity equation (2.47) — we can write (2.48) as 


a a j a 


This is the equation which captures momentum conservation in our system. Indeed, it 
has a simple interpretation in terms of Newton’s second law. The left-hand-side is the 
acceleration of an element of fluid. The combination of derivatives is sometimes called 
the material derivative, 


pee o 


ait “iar; (2.51) 


It captures the rate of change of a quantity as seen by an observer swept along the 
streamline of the fluid. The right-hand side of (2.50) includes both the external force 
F and an additional term involving the internal pressure of the fluid. As we will see 
later, ultimately viscous terms will also come from here. 


Note that, once again, the equation (2.50) does not provide a closed equation for 
the velocity ü. You now need to know the pressure tensor P;; which depends on the 
particular distribution. 


Kinetic Energy 


Our final collisional invariant is the kinetic energy of the particles. However, rather 
than take the absolute kinetic energy, it is slightly easier if we work with the relative 
kinetic energy, 


If we substitute this into the master equation® (2.46), the term involving the force 
vanishes (because (v; — ui) = 0). However, the term that involves 0E /ðr; is not zero 
because the average velocity ù depends on 7. We have 


19 
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5There is actually a subtlety here. In deriving the master equation (2.46), we assumed that A has 
no explicit time dependence, but the A defined above does have explicit time dependence through 
u(r, t). Nonetheless, you can check that (2.46) still holds, essentially because the extra term that you 
get is ~ ((U— u) - OW/Ot) = (V — ü) - OU/Ot = 0. 
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At this point, we define the temperature, T(7,t) of our non-equilibrium system. To do 
so, we fall back on the idea of equipartition and write 


Ska (Ft) = 5m((0 — F, t)’ (2.53) 


This coincides with our familiar definition of temperature for a system in local equilib- 
rium (2.29), but now extends this to a system that is out of equilibrium. Note that the 
temperature is a close relative of the pressure tensor, TrP = 3pkgT/m. 


We also define a new quantity, the heat fluz, 


1 


q = 5mo((vi- w) (Ba) (2.54) 


(This actually differs by an overall factor of m from the definition of g that we made in 
Section 1. This has the advantage of making the formulae we’re about to derive a little 
cleaner). The utility of both of these definitions becomes apparent if we play around 
with the middle term in (2.52). We can write 

1 


5inp((v— u) (B=?) + Smpud(T a) 


3 
= qi + zPuikBT 
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Invoking the definition of the pressure tensor (2.49), we can now rewrite (2.52) as 


30 
2 Ot 


o 3 Ou; 
a, (pkeT) + Ir, (a + spuikT ) +mP;j = i Jz, =0 


Because P;; = Pji, we can replace Ou,;/Or; in the last term with the symmetric tensor 
known as the rate of strain (and I promise this is the last new definition for a while!) 


U= (= = =) ee 


Finally, with a little help from the continuity equation (2.47), our expression for the 
conservation of energy becomes 


o o 20q; 2m 2 
aC + uiz -) kepT ++ 3 ðr, Tog Vuta =0 (2.56) 


It’s been a bit of a slog, but finally we have three equations describing how the particle 


density n (2.47), the velocity t (2.50) and the temperature T (2.56) change with time. 
It’s worth stressing that these equations hold for any distribution fı. However, the 
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set of equations are not closed. The equation for n depends on wu; the equation for w 
depends on P;; and the equation for T (which is related to the trace of P;;) depends 
on a new quantity g. And to determine any of these, we need to solve the Boltzmann 
equation and compute the distribution fı. But the Boltzmann equation is hard! How 
to do this? 


2.4.2 Ideal Fluids 


We start by simply guessing a form of the distribution function f, (7, p; t). We know that 
the collision term in the Boltzmann equation induces a fast relaxation to equilibrium, 
so if we’re looking for a slowly varying solution a good guess is to take a distribution 
for which (Of; /Ot).o1 = 0. But we’ve already met distribution functions that obey this 
condition in (2.29): they are those describing local equilibrium. Therefore, our first 
guess for the distribution, which we write as A" is local equilibrium 


m 
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where p = mv. In general, this distribution is not a solution to the Boltzmann equation 


since it does not vanish on the streaming terms. Nonetheless, we will take it as our 
first approximation to the true solution and later see what we’re missing. 


The distribution is normalized so that the number density and temperature defined 
in (2.43) and (2.53) respectively coincide with n(7,t) and T (r,t) in (2.29). But we can 
also use the distribution to compute P;; and g. We have 


Pij = kan? t)T (r,t) dij = P(r, t) di (2.58) 


and g = 0. We can substitute these expressions into our three conservation laws. The 
continuity equation (2.47) remains unchanged. Written in terms for p = mn, it reads 


aa Ou; 
(3 j u) pa P Tr 0 (2.59) 


Meanwhile, the equation (2.50) governing the velocity flow becomes the Euler equation 
describing fluid motion 


o o 10P_ F; 
j iT = 2 
(5. tus =) = por; m ee 
and the final equation (2.56) describing the flow of heat reduces to 
F Uj T4 =0 2.61 
(3 “ Z) 3 ðr; eon) 
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These set of equations describe the motion of an ideal fluid. While they are a good 
starting point for describing many properties of fluid mechanics, there is one thing that 
they are missing: dissipation. There is no irreversibility sown into these equations, no 
mechanism for the fluid to return to equilibrium. 


We may have anticipated that these equations lack dissipation. Their starting point 
was the local equilibrium distribution (2.57) and we saw earlier that for such distribu- 
tions Boltzmann’s H-function does not decrease; the entropy does not increase. In fact, 
we can also show this statement directly from the equations above. We can combine 
(2.59) and (2.60) to find 


which tells us that the quantity pT—%/? is constant along streamlines. But this is the 
requirement that motion along streamlines is adiabatic, not increasing the entropy. To 
see that this is the case, you need to go back to your earlier statistical mechanics or 


thermodynamics course® 


. The usual statement is that for an ideal gas, an adiabatic 
transformation leaves VT°/? constant. Here we’re working with the density p = mN/V 
and this becomes pT~*/? is constant. Note, however, that in the present context p and 
T are not numbers, but functions of space and time: we are now talking about a local 


adiabatic change. 


Sound Waves 


It is also simple to show explicitly that one can set up motion in the ideal fluid that 
doesn’t relax back down to equilibrium. We start with a fluid at rest, setting ù = 0 
and p = p and T = T, with both p and T constant. We now splash it (gently). That 
means that we perturb the system and linearise the resulting equations. We’ll analyse 
these perturbations in Fourier modes and write 


pF, t) =ptdpe@*) and T(P t) = T+ ôT eer) (2.62) 


Furthermore, we’ll look for a particular kind of perturbation in which the fluid motion 
is parallel to the perturbation. In other words, we’re looking for a longitudinal wave 


D(F, t) = kduett-F*) (2.63) 
The linearised versions of (2.59), (2.60) and (2.61) then read 
w 
Ikl 


6See, for example, the discussion of the Carnot cycle in the lectures on Statistical Physics. 
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Y= a 
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There is one solution to these equations with zero frequency, w = 0. These have du = 0 
while dp = —p and ôT = T. (Note that this notation hides a small e. It really means 
that dp = —ep and ôT = eT. Because the equations are linear and homogeneous, you 
can take any € you like but, since we’re looking at small perturbations, it should be 
small). This solution has the property that P = mnkgT is constant. But since, in 
the absence of an external force, pressure is the only driving term in (2.60), the fluid 
remains at rest, which is why du = 0 for this solution. 


Two further solutions to these equations both have dp = p, ôT = ¿T and du = w/|k| 
with the dispersion relation 


= |5kgT 
w = +v.|k| with v= (2.64) 
3m 


These are sound waves, the propagating version of the adiabatic change that we saw 


above: the combination pT~?/? is left unchanged by the compression and expansion of 
the fluid. The quantity v, is the speed of sound. 


2.5 Transport with Collisions 


While it’s nice to have derived some simple equations describing fluid mechanics, as 
we’ve seen they’re missing dissipation. And, since the purported goal of these lectures 
is to understand how systems relax back to equilibrium, we should try to see what 
we’ve missed. 


In fact, it’s clear what we’ve missed. Our first guess for the distribution function was 
local equilibrium 


3/2 ; 
CRD =F) (Srey) (ar Eae) ee) 


We chose this on the grounds that it gives a vanishing contribution to the collision 
integral. But we never checked whether it actually solves the streaming terms in the 
Boltzmann equation. And, as we will now show, it doesn’t. 
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Using the definition of the Poisson bracket and the one-particle Hamiltonian H, 
(2.11), we have 


O gay, ry = OF, x OLD OY 
ðt L w ðp OF 
Now the dependence on p = mv in local equilibrium is easy: it is simply 
af. l-a esti 
Op” mr = ü)fi 


Meanwhile all r dependence and t dependence of fo lies in the functions n(r, t), T(P, t) 
and (r,t). From (2.65) we have 


Of AY 

On n 
afo _ 3 fO | m g- mfo 
OT 2 T ` 2kpT? i 
a” 2 ge a) f 

Ou kpT 1 


Using all these relations, we have 


(0) 7 i a\2 = 
oh — {Fy f} = [Bin E a ) Bar 


M py os a ee ge Sa col 
N g-o- Da- F(t- 2, 
toT” t) - Dit ET (u a] fi” (2,66) 


where we’ve introduced the notation D, which differs from the material derivative D; 
in that it depends on the velocity v rather than the average velocity w, 


= _ 0o , ô an ô 
Ian ap esi 


Now our first attempt at deriving hydrodynamics gave us three equations describing 
how n (2.59), @ (2.60) and T (2.61) change with time. We substitute these into (2.66). 
You'll need a couple of lines of algebra, cancelling some terms, using the relationship 
P = nkpT and the definition of Uj; in (2.55), but it’s not hard to show that we 
ultimately get 


(0) 
A — (8, 4} = | (sep @- a - 5) @-]-VT (2.67) 


And there’s no reason that the right-hand-side is zero. So, unsurprisingly, fO does 
not solve the Boltzmann equation. However, the remaining term depends on VT and 
Ou/Or which means that we if we stick to long wavelength variations in the temperature 
and velocity then we almost have a solution. We need only add a little extra something 
to the distribution 


f= flO +6h (2.68) 
Let’s see how this changes things. 


2.5.1 Relaxation Time Approximation 


The correction term, 6f;, will contribute to the collision integral (2.24). Dropping the 
r argument for clarity, we have 


(E) = fermety a oe lb ADAE -APAP 
= f BP pad pid ph w(i PAB, Ba) [FO PIAP) + 4f PL) 
-HO (PNAP) — SFA Pe) 
where, in the second line, we have used the fact that fO vanishes in the collision 


integral and ignored quadratic terms ~ 6 f?. The resulting collision integral is a linear 
function of 6 fı. But it’s still kind of a mess and not easy to play with. 


At this point, there is a proper way to proceed. This involves first taking more care 
in the expansion of df; (using what is known as the Chapman-Enskog expansion) and 
then treating the linear operator above correctly. However, there is a much easier way 
to make progress: we just replace the collision integral with another, much simpler 
function, that captures much of the relevant physics. We take 


aA) Öh 
(Sr). 7 F. oi 


where 7 is the relaxation time which, as we’ve already seen, governs the rate of change 
of fı. In general, T could be momentum dependent. Here we’ll simply take it to be a 
constant. 


The choice of operator (2.69) is called the relaxation time approximation. (Sometimes 
it is referred to as the Bhatnagar-Gross-Krook operator). It’s most certainly not exact. 
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In fact, it’s a rather cheap approximation. But it will give us a good intuition for what’s 
going on. With this replacement, the Boltzmann equation becomes 


af? + df) 
Ot 


oft 


T 


- {hi ft? +f} =— 


But, since df, << Fine we can ignore df; on the left-hand-side. Then, using (2.67), we 
have a simple expression for the extra contribution to the distribution function 


1 
eg (o = tis Uy = U;) = a= Dis) Us (0) (2.70) 


We can now use this small correction to the distribution to revisit some of the transport 
properties that we saw in Section 1. 


2.5.2 Thermal Conductivity Revisited 
Let’s start by computing the heat flux 


1 >o 
qi = zmp((vi — u) (T — @)’) (2.71) 
using the corrected distribution (2.68). We’ve already seen that the local equilibrium 
distribution fO gave g = 0, so the only contribution comes from 6f;. Moreover, only 
the first term in (2.70) contributes to (2.71). (The other is an odd function and vanishes 
when we do the integral). We have 


q= -KVT 


This is the same phenomenological law that we met in (1.12). The coefficient « is the 
thermal conductivity and is given by 


o TATP 3a, (F 2/2 A2 > m2 5 (0) 
r= TE | dp -80-a sexe a3 i 

o mp|| m 6, I, 4 

= TE en lo — Ste] 


In the second line, we’ve replaced all (v — u) factors with v by performing a (7 
dependent) shift of the integration variable. The subscript (-)o means that these aver- 
ages are to be taken in the local Maxwell-Boltzmann distribution FO with u = 0. These 


LA = 


integrals are simple to perform. We have (v*)) = 15k37?/m? and (v°)9 = 105k37T?/m?, 
giving 
5 
k= 5 nk pT 


The factor of 5/2 here has followed us throughout the calculation. The reason for its 
presence is that its the specific heat at constant pressure, Cp = 2 kp. 


This result is parameterically the same that we found earlier in (1.13). (Although 
you have to be a little careful to check this because, as we mentioned after (2.54), 
the definition of heat flux differs and, correspondingly, «K, differs by a factor of m. 
Moreover, the current formula is written in terms of slightly different variables. To 
make the comparison, you should rewrite the scattering time as T ~ 1 /mony/(v?), 
where ø is the total cross-section and (v2) ~ T/m by equipartition). The coefficient 
differs from our earlier derivation, but it’s not really to be trusted here, not least 
because the only definition of 7 that we have is in the implementation of the relaxation 
time approximation. 


We can also see how the equation (2.56) governing the flow of temperature is related 
to the more simplistic heat flow equation that we introduced in (1.14). For this we 
need to assume both a static fluid w = 0 and also that we can neglect changes in the 
thermal conductivity, 0k /0r ~ 0. Then equation (2.56) reduces to the heat equation 


T 2, 
— = i KVT 
Pie = — 38 


2.5.3 Viscosity Revisited 


Let’s now look at the shear viscosity. From our discussion in Section 1, we know that 
the relevant experimental set-up is a fluid with a velocity gradient, Ou,/0z # 0. The 
shear viscosity is associated to the flux of z-momentum in the z-direction. But this is 
precisely what is computed by the off-diagonal component of the pressure tensor, 


Pox = P((Vx = Ug) (vz = uz)) 


We’ve already seen that the local equilibrium distribution gives a diagonal pressure 
tensor (2.58), corresponding to vanishing viscosity. What happens if we use the cor- 
rected distribution (2.68)? Now only the second term in (2.70) contributes (since the 
first term is an odd function of (v — u)). We write 
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where the extra term I; is called the stress tensor and is given by 


la 
Il; = a Uu f ap (vj — uj) (vi — us) ( (op — u) (vr — w) — = (T — By ) FO 
kgT 3 


mMTp 


1 
= Pig Ori einjer = sulon) 


Before we compute Il;;, note that it is a traceless tensor. This is because the first 


ijs 
term above becomes (v?v;,07)9 = jg (V’VzVr}o Which is easily calculated to be (v?v?)9 = 
5kRT?/m? = 3(v*)o. Moreover, I;; depends linearly on the tensor U;;. These two facts 


mean that I;; must be of the form 
1 7 


In particular, if we set up a fluid gradient with Ou,/0z 4 0, we have 


Ouz 
Oz 


T; =) 


which tells us that we should identify 7 with the shear viscosity. To compute it, we 
return to a general velocity profile which, from (2.73), gives 
mT p 


1 
H= pT" (UzV2VKUI)0 = gon V2" )o 


= 7p ss a Uzr) U0 050s) 0 
_ 2mTp 4 
I5kpT Ot” 0 


Comparing to (2.73), we get an expression for the coefficient n, 
n = nkgTr 


Once again, this differs from our earlier more naive analysis (1.11) only in the overall 
numerical coefficient. And, once again, this coefficient is not really trustworthy due to 
our reliance on the relaxation time approximation. 


The scattering time T occurs in both the thermal conductivity and the viscosity. Tak- 
ing the ratio of the two, we can construct a dimensionless number which characterises 
our system. This is called the Prandtl number, 
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With c, the specific heat at constant pressure which takes the value cp = 5kg/2 for a 
monatomic gas. Our calculations above give a Prandtl number Pr = 1. Experimental 
data for monatomic gases shows a range of Prandtl numbers, hovering around Pr = 2/3. 
The reason for the discrepancy lies in the use of the relaxation time approximation. A 
more direct treatment of the collision integral, thought of as a linear operator acting 
on ôf, gives the result Pr = 2/3, in much better agreement with the data’. 


2.6 A Second Look: The Navier-Stokes Equation 


To end our discussion of kinetic theory, we put together our set of equations governing 
the conservation of density, momentum and energy with the corrected distribution 
function. The equation of motion for density fluctuations doe not change: it remains, 


= +V- (pū) =0 (2.74) 


Meanwhile the equation for momentum (2.50) now has an extra contribution from the 
stress tensor contribution (2.72). Moreover, we typically assume that, to leading order, 
variations in the viscosity can be neglected: Vn ~ 0. Written in vector notation rather 
than index notation, the resulting equation is 


ð Fol 
hey pS CVs Vt iy ee) (2.75) 
ot m p p 3p 

This is the Navier-Stokes equation. Finally, we have the heat conduction equation. We 
again drop some terms on the grounds that they are small. This time, we set Vk ~ 0 
and U;,I1,; ~ 0; both are small at the order we are working to. We're left with 


o 2 2m 
— +0- V| T-T += PV- ea 
p ( DE +u v) zV + 3 Vu 
We can again look at fluctuations of these equations about a static fluid with p = J, 
T = T and & = 0. Longitudinal fluctuations (2.62) and (2.63) now give rise to the 


linearised equations of motion, 


wip = A\k\ou 
kpT > kp > 4n|k|? 

ip ip ar a 
mp m 3 
— KR? 

ôT = =T|k|ou — iP 

wW 3 |klou — i kei ô 


TYou can read about this improved calculations in the lectures by Daniel Arovas. 
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Notice that terms involving transport coefficients 7 and x each come with a factor of 
i; this is a sign that they will give rise to dissipation. To compute the frequencies of 
the different modes, it’s best to think of this as an eigenvalue problem for w/|k|; the 
coefficients of the various terms on the right-hand-side define a 3 x 3 matrix M, with 


2i KR AT 
det M = ae | 


(4. 26 |kD?P 
d TrM=-— 
an r i(i) 5 


mp 


The product of the three eigenvalues is equal to det M. We know that for the ideal 
fluid, the eigenvalues are zero and w = +v,|k| where vs is the sound speed computed 
in (2.64). Let’s first look at the eigenvalue that was zero, corresponding to fluctuations 
of constant pressure. Working to leading order in & and 7, we must have 
21712 2i K > 
—vi |k w =det M => w=- |k| 
5 kgp 

The purely imaginary frequency is telling us that these modes are damped. The w ~ 
i|k|? is characteristic of diffusive behaviour. 


The remaining two modes are related to the sound waves. These too will gain a 
dispersive contribution, now with 


w = t0,|k| — iy (2.76) 


Using the fact that the sum of the eigenvalues is equal to the trace, we find 


2 2 k\ Jk 
= | 20 
v= (Grr E)E (2.77) 


The fluctuations above are all longitudinal. There are also two shear modes, whose 
fluctuations are in a direction perpendicular to the velocity. It is simple to check that 
the linearised equations are solved by dp = ôT = 0 and 6u-k, with the frequency given 
by 


kl? 
nk 

p 
Once again, we see that these modes behave diffusively. 


Navier Stokes Equation and Liquids 


Our derivation of the Navier-Stokes equation relied on the dilute gas approximation. 
However, the equation is more general than that. Indeed, it can be thought of as the 
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most general expression in a derivative expansion for momentum transport (subject to 
various requirements). In fact, there is one extra parameter that we could include: 


ð F ? 
p(2 tiv) ü= — VP +nVit (Z+) vive a) 
where Ç is the bulk viscosity which vanished in our derivation above. Although the 
equation above governs transport in liquids, we should stress that first-principles com- 
putations of the viscosity (and also thermal conductivity) that we saw previously only 
hold in the dilute gas approximation. 
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3. Stochastic Processes 


We learn in kindergarten about the phenomenon of Brownian motion, the random 
jittery movement that a particle suffers when it is placed in a liquid. Famously, it is 
caused by the constant bombardment due to molecules in the surrounding the liquid. 
Our goal in this section is to introduce the mathematical formalism that allows us to 
model such random behaviour. 


3.1 The Langevin Equation 


In contrast to the previous section, we will here focus on just a single particle. However, 
this particle will be sitting in a background medium. If we know the force F acting on 
the particle, its motion is entirely deterministic, governed by 


m? = —yi+F (3.1) 


In contrast to the previous section, this is not a Hamiltonian system. This is because 
we have included a friction term with a coefficient y. This arises due to the viscosity, n, 
of the surrounding liquid that we met in the previous section. If we model the particle 
as a sphere of radius a then there is a formula due to Stokes which says y = 677a. 
However, in what follows we shall simply treat y as a fixed parameter. In the presence 
of a time independent force, the steady-state solution with Z=0 is 


a 
zef 
y 


For this reason, the quantity 1/y is sometimes referred to as the mobility. 


Returning to (3.1), for any specified force F, the path of the particle is fully deter- 
mined. This is seemingly at odds with the random behaviour observed in Brownian 
motion. The way in which we reconcile these two points is, hopefully, obvious: in 
Brownian motion the force F that the particle feels is itself random. In fact, we will 
split the force into two pieces, 


F=-VV+f(t) 


Here V is a fixed background potential in which the particle is moving. Perhaps V 
arises because the particle is moving in gravity; perhaps because it is attached to a 
spring. But, either way, there is nothing random about V. In contrast, fi (t) is the 
random force that the particle experiences due to all the other atoms in the liquid. It is 


sometimes referred to as noise. The resulting equation is called the Langevin equation 


mr = =y — VV + fit) (3.2) 
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Although it looks just like an ordinary differential equation, it is, in fact, a different 
beast known as a stochastic differential equation. The reason that it’s different is that 
we don’t actually know what f(t) is. Yet, somehow, we must solve this equation 
anyway! 

Let’s clarify what is meant by this. Suppose that you did know the microscopic 
force fi (t) that is experienced by a given particle. Then you could, in principle, go 
ahead and solve the Langevin equation (3.2). But the next particle that you look at 
will experience a different force fí (t) so you'll have to solve (3.2) again. And for the 
third particle, you'll have to solve it yet again. Clearly, this is going to become tedious. 


> 


What’s more, it’s unrealistic to think that we will actually know f(t) in any specific 
case. Instead, we admit that we only know certain crude features of the force fí (t) 
such as, for example, its average value. Then we might hope that this is sufficient 
information to figure out, say, the average value of z(t). That is the goal when solving 


the Langevin equation. 


3.1.1 Diffusion in a Very Viscous Fluid 


We start by solving the Langevin equation in the case of vanishing potential, V = 
0. (For an arbitrary potential, the Langevin equation is an unpleasant non-linear 
stochastic differential equation and is beyond our ambition in this course. However, we 
will discuss some properties of the case with potential is the following section when we 
introduce the Fokker-Planck equation). We can simplify the problem even further by 
considering Brownian motion in a very viscous liquid. In this case, motion is entirely 
dominated by the friction term in the Langevin equation and we ignore the inertial 
term, which is tantamount to setting m = 0. 


When m = 0, we’re left with a first order equation, 


> 


TE 
a(t) = oe 


> 


For any f(t), this can be trivially integrated to give 


= =F 1 ‘ 1 Fip 
z(t) = += | dt! F(t) (3.3) 


At this point, we can’t go any further until we specify some of the properties of the 


noise f(t). Our first assumption is that, on average, the noise vanishes at any given 
time. We will denote averages by (-), so this assumption reads 


> 


(f(@)) =0 (3.4) 
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Taking the average of (3.3) then gives us the result: 


(r(t)) = £(0) 


This is deeply unsurprising: if the average noise vanishes, the average position of the 
particle is simply where we left it to begin with. Nonetheless, it’s worth stressing that 
this doesn’t mean that all particles sit where you leave them. It means that if you 
drop many identical particles at the origin, #(0) = 0, then they will all move but their 
average position — or their centre of mass — will remain at the origin. 


We can get more information by looking at the variance of the position, 


This will tell us the average spread of the particles. We can derive an expression for 
the variance by first squaring (3.3) and then taking the average, 


(a(t) - #0)") = 5 / at f dt, ( Ft) FE) (3.5) 


In order to compute this, we need to specify more information about the noise, namely 
its correlation function ( f;(t1)f;(t2) ) where we have resorted to index notation, i, 7 = 
1,2,3 to denote the direction of the force. This is specifying how likely it is that the 
particle will receive a given kick fj at time tz given that it received a kick f; at time tı. 


In many cases of interest, including that of Brownian motion, the kicks imparted by 
the noise are both fast and uncorrelated. Let me explain what this means. Suppose 
that a given collision between our particle and an atom takes time Teon. Then if we 
focus on time scales less than Teon then there will clearly be a correlation between the 
forces imparted on our particle because these forces are due to the same process that’s 
already taking place. (If an atom is coming in from the left, then it’s still coming in 
from the left at a time t < Teon later). However if we look on time scales t >> Teon, the 
force will be due to a different collision with a different atom. The statement that the 
noise is uncorrelated means that the force imparted by later collisions knows nothing 
about earlier collisions. Mathematically, this means 


(filti) fj(te)) =0 when tz- ti > Toon 


The statement that the collisions are fast means that we only care about time scales 
tə — tı >> Teo and so can effectively take the limit Ten — 0. However, that doesn’t 
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quite mean that we can just ignore this correlation function. Instead, when we take 
the limit Teon —> 0, we’re left with a delta-function contribution, 


( filt) fj (t2) ) = 2D? bi; 5 (te — t1) (3.6) 


Here the factor of 7? has been put in for convenience. We will shortly see the inter- 
pretation of the coefficient D, which governs the strength of the correlations. Noise 
which obeys (3.4) and (3.6) is often referred to as white noise. It is valid whenever the 
environment relaxes back down to equilibrium much faster than the system of interest. 
This guarantees that, although the system is still reeling from the previous kick, the 
environment remembers nothing of what went before and kicks again, as fresh and 
random as the first time. 


Using this expression for white noise, the variance (3.5) in the position of the particles 

is 
{ (Z) — £(0))?) =6Dt (3.7) 
This is an important result: the root-mean square of the distance increases as vt with 
time. This is characteristic behaviour of diffusion. The coefficient D is called the 


diffusion constant. (We put the factor of 7? in the correlation function (3.6) so that 
this equation would come out nicely). 


3.1.2 Diffusion in a Less Viscous Liquid 
Let’s now return to the Langevin equation (3.2) and repeat our analysis, this time 


retaining the inertia term, som #4 0. We will still set V = 0. 


As before, computing average quantities — this time both velocity (Z(t) } and posi- 
tion (Z(t)) is straightforward and relatively uninteresting. For a given f(t), it is not 
difficult to solve (3.2). After multiplying by an integrating factor e7/™, the equation 
becomes 


d /: 1 > 
— ( petm Ny — = yt/m 
a (2) = o 


which can be happily integrated to give 


> 


1 fi ; 
Z(t) = Z(0)e 77 + — i ese (3.8) 
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We now use the fact that the average of noise vanishes (3.4) to find that the average 
velocity is simply that of a damped particle in the absence of any noise, 


((1)) = Eel” 
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Similarly, to determine the average position we have 


z(t) = £(0) + f dt! Z(t’) (3.9) 


From which we get 


EO) = #0) + | ae Ee) 
=f a — etm 
(0) + ™ #0) (1 ) 


Again, this is unsurprising: when the average noise vanishes, the average position of 
the particle coincides with that of a particle that didn’t experience any noise. 


Things get more interesting when we look at the expectation values of quadratic 
quantities. This includes the variance in position ((t)- Z(t) } and velocity (Z(t) Z(t) ), 
but also more general correlation functions in which the two quantities are evaluated at 
different times. For example, the correlation function ( t;(tı)t;(t2) ) tells us information 
about the velocity of the particle at time tə given that we know where its velocity at 
time tı. From (3.8), we have the expression, 


(tlia) = (alts) Kisla) + 5 / “dt / t att, (EN) om 


where we made use of the fact that (f(£)) = 0 to drop the terms linear in the noise 
f. If we use the white noise correlation function (3.6), and assume tə > tı > 0, the 


integral in the second term becomes, 


2 ty 
(@4(t1)£(t2)) = (a(t) )( £5 (te) ) + -ma O e a eitim 
0 


D 
= (alts) (ilta) ) + Tay (ert — emaa) 
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For very large times, t,,t2 — oo, we can drop the last term as well as the average 
velocities since (Z(t) ) — 0. We learn that the correlation between velocities decays 
exponentially as 


D 
(@;(t1)£5(t2)) —> att e-Va-t)/m 
m 


This means that if you know the velocity of the particle at some time tı, then you can 
be fairly confident that it will have a similar velocity at a time t2 < tı + m/y later. 
But if you wait longer than time m/y then you would be a fool to make any bets on 
the velocity based only on your knowledge at time t. 


TEE 


Finally, we can also use this result to compute the average velocity-squared (which, 
of course, is the kinetic energy of the system). At late times, the any initial velocity 
has died away and the resulting kinetic energy is due entirely to the bombardment by 
the environment. It is independent of time and given by 
a (3.10) 
m 

One can compute similar correlation functions for position (2;(t1)x;(t2)). The ex- 
pressions are a little more tricky but still quite manageable. (Combining equations 
(3.9) and (3.8), you can see that you will a quadruple integral to perform and figuring 
out the limits is a little fiddly). At late times, it turns out that the variance of the 
position is given by the same expression that we saw for the viscous liquid (3.7), 


( (Z(t) — #(0))?) =6Dt (3.11) 
again exhibiting the now-familiar vt behaviour for the root-mean-square distance. 


3.1.3 The Einstein Relation 


We brushed over something important and lovely in the previous discussion. We com- 
puted the average kinetic energy of a particle in (3.10). It is 


I y& 2 3 
E=-m£-£)=-—D 
set) = 5Dy 
But we already know what the average energy of a particle is when it’s bombarded by 
its environment: it is given by the equipartition theorem and, crucially, depends only 


on the temperature of the surroundings 


3 
E = -kgT 
5B 
It must be therefore that the diffusion constant D is related to the mobility 1/y by 
kgT 
po- (3.12) 
Y 


That’s rather surprising! The diffusion constant captures the amount a particle is 
kicked around due to the background medium; the mobility expresses the how hard it 
is for a particle to plough through the background medium. And yet they are related. 
This equation is telling us that diffusion and viscosity both have their microscopic origin 
in the random bombardment of molecules. Notice that D is inversely proportional to 
y. Yet you might have thought that the amount the particle is kicked increases as the 
viscosity increases. Indeed, looking back at (3.6), you can see that the amount the 
particle is kicked is actually proportional to Dy? ~ Ty. Which is more in line with our 
intuition. 
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Equation (3.12) is known as the Einstein relation. It is an important example of 
the fluctuation-dissipation theorem. The fluctuations of the particle as it undergoes 
its random walk are related to the drag force (or dissipation of momentum) that the 
particle feels as it moves through the fluid. 


The Einstein relation gives an excellent way to determine Boltzmann’s constant ex- 
perimentally. Watch a particle perform a Brownian jitter. After time t, the distance 
travelled by the particle (3.7) should be 


kpT 
(z3 = 24 
mna 
where we have used the Stokes formula y = 6rna to relate the mobility to the viscosity 
u and radius a of the particle. This experiment was done in 1909 by the French physicist 


Jean Baptiste Perrin and won him the 1926 Nobel prize. 


3.1.4 Noise Probability Distributions 


So far, we’ve only needed to use the two pieces of information about the noise, namely, 


(f(t)) =0 (3.13) 
( Filta) f5(t2)) = 2D776;j5(t1 — ta) (3.14) 


However, if we wanted to compute correlation functions involving more than two ve- 
locities or positions, it should be clear from the calculation that we would need to 
know higher moments of the probability distribution for fi (t). In fact, the definition of 
white noise is that there are no non-trivial correlations other than ( f;(t1) f;(t2) ). This 
doesn’t mean that the higher correlation functions are vanishing, just that they can be 


reduced to the two-time correlators. This means that for N even, 


(fati) fi (tw) ) = Sa (t) fin (to) )--- Sin- (Eni) fix (tw) ) + permutations 
while, for N odd 


( fix(ti) .-. fiy(tw)) = 0 


Another way of saying this is that all but the second cumulant of the probability 
distribution vanish. 


Instead of specifying all these moments of the distribution, it is often much more 


=> 


useful to specify the probability distribution for f(t) directly. However, this is a slightly 


subtle object because we want to specify the probability for an entire function fl (t), 
rather than a single random variable. This means that the probability distribution 


must be a functional: you give it a function f(t) and it spits back a number which, in 
this case, should be between zero and one. 
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The good news is that, among the class of probability distributions over functions, 
the white noise distribution is by far the easiest! If we were dealing with a single 
random variable, the distribution that has only two-point correlators but no higher is 
the Gaussian. And, suitably generalised, this also works for our functional probability 
distribution. The probability distribution that gives white noise is 


Prob[f(t)] = N exp (- [oe aw) 


—oco 


where M is a normalization factor which is needed to ensure that the sum over all 


probabilities gives unity. This “sum” is really a sum over all functions f(t) or, in other 
words, a functional integral. The normalization condition which fixes M is then 


f Df(t) Prob[f(t)] = 1 (3.15) 


With this probability distribution, all averaging over the noise can now be computed 
as a functional integral. If you have any function g(x), then its average is 


(g(x)) =N | Df glæp) e7 S% F/4DY 


where the notation x; means the solution to the Langevin equation in the presence of 
a fixed source f. 

Let’s now show that the Gaussian probability distribution indeed reproduces the 
white noise correlations as claimed. To do this, we first introduce an object Z[.J(t)| 
known as a generating function. We can introduce a generating function for any prob- 
ability distribution, so let’s keep things general for now and later specialise to the 
Gaussian distribution. 


ZIO = f DIO Prose { f Maere o) 


—co 


> 


This generating function is a functional: it is a function of any function J(t) that we 
care to feed it. By construction, Z[0] = 1, courtesy of (3.15). 


As the notation Z suggests, the generating function has much in common with the 
partition function that we work with in a first course of Statistical Mechanics. This is 
most apparent in the context of statistical field theories where the generating function 
is reminiscent of the partition function. Both are functional, or path, integrals. These 
objects are also important in quantum field theory where the names partition function 
and generating function are often used synonymously. 
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The function J that we have introduced is, in this context, really little more than 
a trick that allows us to encode all the correlation functions in Z[J]. To see how this 
works. Suppose that we differentiate Z with respect to J evaluated at some time t = tı 
and then set J = 0. We have 
A 
= | Dft) fiti) Probl ft) = (fit 
TE pg 7 J PEO ft) Probl = (lt) 


Playing the same game, first taking n derivatives, gives 


STE T aT TED |, S PA Eal) Fite) DO 
= ( fa (ti) fiz (2) --- fin (tn) ) 


= 


So we see that if we can compute Z|J], then successive correlation functions are simply 


the coefficients of a Taylor expansion in J. This is particularly useful for the Gaussian 
distribution where the generating function is, 


ZIJE =N | Df exp (-/ wat Oe | ie: ro) 


OO 


But this is nothing more than a Gaussian integral. (Ok, it’s an infinite number of 
Gaussian integrals because it’s a functional integral. But we shouldn’t let that phase 
us). We can easily compute it by completing the square 

+00 


ZIJOAN f proe (- J odi [O -2DP 7o] -4D F(t) 79) 


After the shift of variable, F > f= 2D, the integral reduces to (3.15), leaving 
behind 
> oo => > 
ZO = exp (Dy? fat Fw) Fe) 
Now it is an easy matter to compute correlation functions. Taking one derivative, we 
have 


0Z 5 
—— =2D7 Ji(ti) Z 
Sig) 72D Mh) Z 
But this vanishes when we set J = 0, in agreement with our requirement (3.13) that 
the average noise vanishes. Taking a second derivative gives, 
672 = 
—— ee eS ID a — ta ZIJ] A) Z 
6J;(t1)0J; (ta) Y Oy (t 2) [J] + Y (t1) i( 2) | | 
Now setting J = 0, only the first term survives and reproduces the white noise corre- 
lation (3.14). One can continue the process to see that all higher correlation functions 
are entirely determined by (f; fj ). 


— 61 = 


3.1.5 Stochastic Processes for Fields 


Finally, it’s worth mentioning that Langevin-type equations are not restricted to par- 
ticle positions. It is also of interest to write down stochastic processes for fields. For 
example, we may want to consider a time dependent process for some order parameter 
m(r,t), influenced by noise 

om 


ap =V m-am- 2bm* + f 


where f(7,t) is a random field with correlations (f) = 0 and 
(F(Fi ta) f (7, t2) ) ~ OF — 72)0 (tr — te) 


A famous example of a stochastic process is provided by the fluctuating boundary 
between, say, a gas and a liquid. Denoting the height of the boundary as h(7,t), the 
simplest description of the boundary fluctuations is given by the Edwards- Wilkinson 
equation, 


Oh A 
L = Vh 
Ji Vht+f 
A somewhat more accurate model is given by the Kardar-Parisi-Zhang equation, 
Oh 


We won’t have anything to say about the properties of these equations in this course. 
An introduction can be found in the second book by Kardar. 


3.2 The Fokker-Planck Equation 


Drop a particle at some position, say o at time to. If the subsequent evolution is noisy, 
so that it is governed by a stochastic Langevin equation, then we’ve got no way to know 
for sure where the particle will be. The best that we can do is talk about probabilities. 
We will denote the probability that the particle sits at 7 at time t as P(Z,t; Zo, to). 


In the previous section we expressed our uncertainty in the position of the particle 
in terms of correlation functions. Here we shift perspective a little. We would like to 
ask: what probability distribution P(Z,t; Zo, to) would give rise to the same correlation 
functions that arose from the Langevin equation? 


We should stress that we care nothing about the particular path z(t) that the particle 
took. The probability distribution over paths would be a rather complicated functional 
(rather like those we saw in Section 3.1.4). Instead we will ask the much simpler 
question of the probability that the particle sits at z% at time t, regardless of how it got 
there. 
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It is simple to write down a formal expression for the probability distribution. Let’s 
denote the solution to the Langevin equation for a given noise function f as Xp. Of 
course, if we know the noise, then there is no uncertainty in the probability distribution 
for Z. It is simply P(#,t) = (X — Zs). Now averaging over all possible noise, we can 
write the probability distribution as 


P(@,t) = (6(# — Z;)) (3.16) 


In this section, we shall show that the P(#,t) obeys a simple partial differential equation 
known as the Fokker-Planck equation. 


3.2.1 The Diffusion Equation 


The simplest stochastic process we studied was a particle subject to random forces in 
a very viscous fluid. The Langevin equation is 


> 


x(t) = = f(t) 


2 |r 


In Section 3.1.1 we showed that the average position of the particle remains unchanged: 
if Z(t = 0) = 0 then (z(t)) = 0 for all t. But the variance of the particle undergoes a 
random walk (3.7), 


(Z) = 6Dt (3.17) 


For this simple case, we won’t derive the probability distribution: we’ll just write it 
down. The probability distribution that reproduces this variance: it is just a Gaussian 


3/2 
pap - (te —#?/4Dt 3.18 
(X,t) T e (3.18) 


where the factor out front is determined by the normalization requirement that 


fë P(x,t)=1 


for all time t. Note that there is more information contained in this probability dis- 
tribution that the just the variance (3.17). Specifically, all higher cumulants vanish. 
(This means, for example, that (z3) = 0 and (#4) =3( Z°} and so on). But it simple 
to check that this is indeed what arises from the Langevin equation with white noise 
described in Section 3.1.4. 
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The probability distribution (3.18) obeys the diffusion equation, 


OP 

= = DV’P 

ot 
This is the simplest example of a Fokker-Planck equation. However, for more com- 
plicated versions of the Langevin equation, we will have to work harder to derive the 


analogous equation governing the probability distribution P. 


3.2.2 Meet the Fokker-Planck Equation 


Let’s now consider the a more general stochastic process. We’ll still work in the viscous 
limit for now, setting m = 0 so that we have a first order Langevin equation, 


yž=-VV +f (3.19) 


A quadratic V corresponds to a harmonic oscillator potential and the Langevin equation 
is not difficult to solve. (In fact, mathematically it is the same problem that we solved 
in Section 3.1.2. You just have to replace 7 = 0 —> 7). Any other V gives rise to a non- 
linear stochastic equation (confusingly sometimes called “quasi-linear” in this context) 
and no general solution is available. Nonetheless, we will still be able to massage this 
into the form of a Fokker-Planck equation. 


We begin by extracting some information from the Langevin equation. Consider a 
particle sitting at some point x at time t. If we look again a short time ôt later, the 
particle will have moved a small amount 


i 1 1 t+ôt 3 
ôt =r0t = = VV ôt + 7 f dt’ f(t’) (3.20) 
t 


Here we’ve taken the average value of the noise function, f, over the small time interval. 
However, we’ve assumed that the displacement of the particle 67 is small enough so 
that we can evaluate the force VV at the original position 7. (It turns out that this 
is ok in the present context but there are often pitfalls in making such assumptions in 
the theory of stochastic processes. We’ll comment on one such pitfall at the end of this 
Section). We can now compute the average. Because ( fi (t)) = 0, we have 


1 
(32) = -— VV öt (3.21) 


The computation (ôx; 6x;) is also straightforward, 


t+ôt 


+ fa i "f oT (filt!) H(t") 
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Both the first two terms are order 6t?. However, using (3.6), one of the integrals in 
the third term is killed by the delta function, leaving just one integral standing. This 
ensures that the third term is actually proportional to ôt, 


(6x;62x;) = 26,;D ôt + O(t’) (3.22) 


We will ignore the terms of order ôt?. Moreover, It is simple to see that all higher 
correlation functions are higher order in dt. For example, (Z^) ~ ot?. These too will 
be ignored. 


Our strategy now is to construct a probability distribution that reproduces (3.21) 
and (3.22). We start by considering the conditional probability P(Z,t + ôt; £’, t) that 
the particle sits at # at time t + ôt given that, a moment earlier, it was sitting at 7’. 
From the definition (3.16) we can write this as 


P(&,t + ôt; Z',t) = (6(Z — Z' — 62)) 


where dx is the random variable here; it is the distance moved in time ôt. Next, 
we do something that may look fishy: we Taylor expand the delta-function. If you’re 
nervous about expanding a distribution in this way, you could always regulate the delta 
function in your favourite manner to turn it into a well behaved function. However, 
more pertinently, we will see that the resulting expression sits inside an integral where 
any offending terms make perfect sense. For now, we just proceed naively 


e F ð 1 o? as 
We have truncated at second order because we want to compare this to (3.27) and, as 
we saw above, (67) and (6%”) are the only terms that are order ôt. 


We now have all the information that we need. We just have to compare (3.27) and 
(3.23) and figure out how to deal with those delta functions. To do this, we need one 
more trick. Firstly, recall that our real interest is in the evolution of the probability 
P(#,t; Zo, to), given some initial, arbitrary starting position Z(t = to) = Zo. There is an 
obvious property that this probability must satisfy: if you look at some intermediate 
time ty < t < t, then the particle has to be somewhere. Written as an equation, this 
“has to be somewhere” property is called the Chapman-Kolmogorov equation 


+00 
P(T, t; Zo, to) = f da" Paka tv) PZ: Zo, to) (3.24) 
Replacing t by t + dt, we can substitute our expression (3.23) into the Chapman- 
Kolmogorov equation, and then integrate by parts so that the derivatives on the delta 
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function turn and hit P(2z’,t’; Zo, to). The delta-function, now unattended by deriva- 
tives, kills the integral, leaving 


P(E t + öt; Ho, to) = P(E, t; Zo, to) — 2 ( (das) P(E, t; To, to)) 


1 i da 


Using our expressions for (dx) and (6762) given in (3.21) and (3.22), this becomes 


7 + MOE 1-9 POV arras 
P(Z,t + ôt; To, to) = P(T, t; čo, to) + yÔ; (Z P(E, Zot) ôt 
82 


But we can also get a much simpler expression for the left-hand side simply by Taylor 
expanding with respect to time, 


0 


Equating (3.27) with (3.26) gives us our final result, 


OP 1 
—=—V-(P DV? P 9 
a ay (PVV)+ DV (3.28) 


This is the Fokker-Planck equation. This form also goes by the name of the Smolu- 
chowski equation or, for probabilists, Kolomogorov’s forward equation. 


Properties of the Fokker-Planck Equation 


It is useful to write the Fokker-Planck equation as a continuity equation 


OP > 
=V. 3.29 
ap (3.29) 
where the probability current is 
> 1 
J = -PVV + DVP (3.30) 
Y 


The second term is clearly due to diffusion (because there’s a big capital D in front of 
it); the first term is due to the potential and is often referred to as the drift, meaning 
the overall motion of the particle due to background forces that we understand. 
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One advantage of writing the Fokker-Planck equation in terms of a current is that 
we see immediately that probability is conserved, meaning that if f d’z P = 1 at some 
point in time then it will remain so for all later times. This follows by a standard 


argument, 
o ey. 3 OP 3 > 


where the last equality follows because we have a total derivative (and we are implicitly 
assuming that there’s no chance that the particle escapes to infinity so we can drop the 
boundary term). 


The Fokker-Planck equation tells us how systems evolve. For some systems, such as 
those described by the diffusion equation, there is no end point to this evolution: the 
system just spreads out more and more. However, for generic potentials V there are 
time-independent solutions to the Fokker-Planck equation obeying V - J = 0. These 
are the equilibrium configurations. The solution is given by 


P(#) ~ eV (#)/yD 


Using the Einstein relation (3.12), this becomes something very familiar. It is simply 
the Boltzmann distribution for a particle with energy V(#) in thermal equilibrium 


P(@) ~ eV @/kaP (3.31) 


Isn’t that nice! (Note that there’s no kinetic energy in the exponent as we set m = 0 
as our starting point). 


An Application: Escape over a Barrier 


As an application of the Fokker-Planck equation, consider thermal escape from the 
one-dimensional potential shown in Figure 7. We’ll assume that all the particles start 
off sitting close to the local minimum at 2,,j,. We model the potential close to this 
point as 
l 5 2 
V(x) ~ 5M min (2 — Tmin) 
and we start our particles in a distribution that is effectively in local equilbrium (3.31), 
with 
2 


w2. 2 2 
min p—Wf in(£—Tmin)/2kgT 3.32 
2rkgT i l ) 


Pir t=0)= 


=6 f= 


VED 


Figure 7: Escape over a Barrier. 


But, globally, £min is not the lowest energy configuration and this probability distribu- 
tion is not the equilibrium configuration. In fact, as drawn, the potential has no global 
minimum and there is no equilibrium distribution. So this isn’t what we’ll set out to 
find. Instead, we would like to calculate the rate at which particles leak out of the trap 
and over the barrier. 


Although we’re clearly interested in a time dependent process, the way we proceed is 
to assume that the leakage is small and so can be effectively treated as a steady state 
process. This means that we think of the original probability distribution of particles 
(3.32) as a bath which, at least on the time scales of interest, is unchanging. The steady 
state leakage is modelled by a constant probability current J = Jo, with J given by 
(3.30). Using the Einstein relation D = kgT/y, we can rewrite this as 


T= kBT -v(2)/ksT ð (e+V©)/ksT p) 
y Ox 


The first step is to integrate Jo et V(*)/keT between the minimum x,,;,, and some distance 
far from all the action, © >> %max, which we may as we call x = 2,, 


f dx Jae lke? — kel [eV @min)/BBT P(t nin) — eV C/T P(x,)] 
Tmin y 


But we can take the probability P(x„) to be vanishingly small compared to P(£min) 
given in (3.32), leaving us with 


a kpT | wÊ, 
dx J V(x)/kgT x WBE min 3.33 
a i ° i Y 2rkpT ( ) 
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Meanwhile, the integral on the left-hand-side is dominated by the maximum of the 
potential. Let’s suppose that close to the maximum, the potential looks like 


1 


V(x) ~ Vinax — z 


nae = Dee) 
Then we'll write the integral as 


2rkgT 
w2 


max 


nf dx eV @)/kaT a Jy eVmax/kBT (3.34) 


Combining the two expressions (3.33) and (3.34), we get our final result for the rate of 
escape over the barrier 


Ji iy WminWmax e` Vmax/kBT 
217 


3.2.3 Velocity Diffusion 


So far we’ve ignored the inertia term, setting m = 0. Let’s now put it back in. We can 
start by setting the potential to zero, so that the Langevin equation is 


m? = —7F + f(t) 


But, we can trivially rewrite this as a first order equation involving 0 = z, 
mb = -77 + f(t) 

This means that if we’re only interested in the distribution over velocities, P(v,t), then 
we have exactly the same problem that we’ve just looked at, simply replacing 7 => Vv 
and y > m. (Actually, you need to be a little more careful. The diffusion constant 
D that appears in (3.28) was really Dy?/7? where the numerator arose from the noise 
correlator and the denominator from the yz term in the Langevin equation. Only the 
latter changes, meaning that this combination gets replaced by Dy?/m?). The resulting 
Fokker-Planck equation is 


+ (3.35) 


The equilibrium distribution that follows from this obeys 0P/Ot = 0, meaning 


OP m m 3/2 2 
n == Pý => P = -mü /2kgT 
ð Dy” (zr) © 


where we’ve again used the Einstein relation Dy = kgT. This, of course, is the Maxwell- 
Boltzmann distribution. 
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In fact, we can do better than this. Suppose that we start all the particles off at 
t = 0 with some fixed velocity, Y = vo. This mean that the probability distribution is 
a delta-function, P(v,t = 0) = (v — vo). We can write down a full time-dependent 
solution to the Fokker-Planck equation (3.35) with this initial condition. 


Pt) = ( m Oe = (- mB — e7) ) 


2rkgpT(1 — e7?t/m 2kgT(1 — e72t/m) 


As t + oo, we return to the Maxwell-Boltzmann distribution. But now this tells us 
how we approach equilibrium. 
The Kramers-Chandrasekhar Fokker-Planck Equation 


As our final example of a Fokker-Planck equation, we can consider the Langevin equa- 
tion with both acceleration term and potential term, 


mr = =y — VV + Fit) 


Now we are looking for a probability distribution over phase space, P(Z, Z, t). The 
right way to proceed is to write this as two first-order equations. The first of these is 
simply the definition of velocity v = x. The second is the Langevin equation 


mv = -y7 — VV + fit) 


These can now be combined into a single Langevin equation for six variables. Once 
armed with this, we need only follow the method that we saw above to arrive at a 
Fokker-Planck equation for P(X, v,t), 


ð 2 9 0 a ie pV Dy? °P 
(3 a Z) n m Ovi (r Pa p) + m? viðv’ (336) 


This form of the Fokker-Planck equations is sometimes called the Kramers equation 
and sometimes called the Chandrasekhar equation. 


Note that this equation is now capturing the same physics that we saw in the Boltz- 
mann equation: the probability distribution P(g, U, t) is the same object that we called 
fi (7, p: t) in Section 2. Moreover, it is possible to derive this form of the Fokker-Planck 
equation, starting from the Boltzmann equation describing a heavy particle in a sur- 
rounding bath of light particles. The key approximation is that in small time intervals 
ôt, the momentum of the heavy particle only changes by a small amount. Looking back, 
you can see that this was indeed an assumption in the derivation of the Fokker-Planck 
equation in Section 3.2.2, but not in the derivation of the Boltzmann equation. 
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Integrating over Velocity 


The equation (3.36) governing the probability distribution over phase space P(g, Uv, t) 
looks very different from the Fokker-Planck equation governing the probability distri- 
bution over configuration space (3.28). Yet the related Langevin equations are simply 
related by setting m = 0 or, equivalently, looking at systems with large y. How can we 
derive (3.28) from (3.36)? 


The computation involves a careful expansion of (3.36) in powers of 1/7. Let’s see 
how this works. Firstly, we use the Einstein relation to write Dy = kgT, and the 
rearrange the terms in (3.36) to become 


ð (kT ô v 170 0 10V ə 
Out ( m? Ovt = =) = y Ç te drt = maxi 2) a en) 


We're going to solve this perturbatively in 1/y, writing 


P= PO 4 


As our first pass at this, we drop anything that has a 1/7, which mean that P) must 
be annihilated by the left-hand-side of (3.37). and This is a simple differential equation, 
with solution 


p®% (v, x,t) = eo mv? /2kBT p (x, t) 


for any function ¢® (x,t). Of course, the velocity dependence here is simply the 
Maxwell-Boltzmann distribution. To figure out what restrictions we have on 6, we 
need to go to the next order in perturbation theory. Keeping terms of O(1/7), the 
differential equation (3.37) becomes 


o kgT o vt o ; o yt OV 2 
i : PY — i (0) -mvt /2kgT : 
ov’ ( m Ov = =) Ç H ag j kpT a9 os ee) 


But this equation cannot be solved for arbitrary ©. This is simplest to see if we just 
integrate both sides over f du: the left-hand-side is a total derivative and so vanishes. 
On the right-hand-side, only one term remains standing and this must vanish. It is 


0d 

a 
Ot 

So ¢ = 6 (x). With this constraint, the solution to (3.38) is, again, straightforward 

to write down. It is 


,09 m wi 
Ox? kpT Ox? 


PY (x, v, t) = (~m pO | oen) eo mv? /2kpT 


= Fl = 


At this point, it doesn’t look like we’re making much progress. We still don’t know what. 
© (x) is and we’ve now had to introduce yet another arbitrary function, 6 (x, t) which 
carries all the time dependence. Let’s plug this once more into (3.37), now working to 
order O(1/7y7). After a little bit of algebra, the equation becomes 


o (= 0 | 2) pe = [moive ( a : me 
g? 


Ov' \ m? Avi m kpT ðxİ 


ƏV m ; \ 3 18V 
ees Le tay | (0) 
Ou! (6, fer w) (2 ket X) d 


ð haji ð l vt OV (1) | e7mv?/2kBT 
= Oxi kpT Oxi 


Once again, there’s a consistency condition that must be realised if this equation is 


to have a solution. Integrating over f dv, the left-hand-side is a total derivative and 
therefore vanishes. Any term linear in v on the right-hand-side also vanishes. But so 
too do the terms on the second line: you can check that the Gaussian integral over the 
ij term exactly cancels the v'v? term. The resulting consistency condition is 
ag -kT ( ð ıl go 
Ot Ox' \ Oat kpT Ox 


where the overall factor of kgT on the right-hand-side comes only arises when you do 


(3.39) 


the Gaussian integral over f dv. 


Now we're almost there. (Although it probably doesn’t feel like it!). Collecting what 
we’ve learned, to order O(1/y), the probability distribution over phase space takes the 
form 


P(z,v,t) = (0 ae oe e OY Xa £) paT 

y ðr?  ykpBT on 
But to make contact with the earlier form of the Fokker-Planck equation (3.28), we 
want a distribution over configuration space. We get this by simply integrating over 
velocities. We’ll also denote the resulting probability distribution as P(x, t), with only 
the arguments to tell us that it’s a different object: 


P(x,t) = [ee P(x, v, t) =4/ -— (0%) + TO n) 


But now we can use the consistency condition (3.39) to compute 0P/0t. Working only 
to order O(1/y), this reads 

ƏP kpT ð (= o1 a9 

ot y On \ dz! ' kpT Oxi 
Which is precisely the Fokker-Planck equation (3.28) that we saw previously. 
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3.2.4 Path Integrals: Schrodinger, Feynman, Fokker and Planck 


There is a close similarity between the Fokker-Planck equation and the Schrodinger 
equation in quantum mechanics. To see this, let’s return to the first order Langevin 
equation 


j= - (-vv 4 F) (3.40) 


and the corresponding Fokker-Planck equation (3.28). We can change variables to 


P(x, t) = eV OPP P(x, t) (3.41) 
Substituting into the Fokker-Planck equation, we see that the rescaled probability P 
obeys 
aP ? 1 Í : 
— = DV’P + | —V?°V — vV) ] P 3.42 
ot ves (5-9 4%? D (v) (3:42) 


There are no first order gradients VP; only V?P. This form of the Fokker-Planck 
equation looks very similar to the Schrodinger equation. 


ee 
ae V UUN 


All that’s missing is a factor of 7 on the left-hand-side. Otherwise, with a few trivial 
substitutions, the two equations look more or less the same. Note, however, that 
the relationship between the potentials is not obvious: if we want to relate the two 
equations, we should identify 


1 


ID? (VV) (3.43) 


1 
U=-—Vv’V 4 

27 
The relationship between the evolution of quantum and classical probabilities is also 
highlighted in the path integral formulation. Recall that the Schrödinger equation 


can be reformulated in terms of function integrals, with the quantum amplitude for a 
particle to travel from 7 = 7; at time t = t; to Zy at time ty given by®. 


š >2 
(Epis fut) = N f Dale) exp (ifa — — va) 


8A derivation of the path integral from the Schrödinger equation can be found in the lectures on 


Classical Dynamics. 
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Figure 8: From Chapman-Kolmogorov to Feynman. 


where M is a normalization factor. Here the integral is over all paths which start at 
(Zi, ti) and end at ¢;,t;). By analogy, we expect there to be a similar path integral 
formulation of the classical probability for a particle in the Langevin environment (3.40) 
to travel from 7; to xs. Indeed, the existence of a path integral formulation for this 
problem is very natural. The essence of this can already be seen in the Chapman- 
Kolmogorov equation (3.24) 
+00 
P@ t; Zo, to) = / Gm P(T, t; 7, t) P(T, t'; Zo, to) 

This simply says that to get from point A to point B, a particle has to pass through 
some position in between. And we sum up the probabilities for each position. Adding 
many more intervening time steps, as shown in Figure 8, naturally suggests that we 
should be summing over all possible paths. 


Deriving the Path Integral 
Here we will sketch the derivation of the path integral formula for the Fokker-Planck 


equation. We’ve already met function integrals in Section 3.1.4 where we introduced 


> 


the probability distribution for a given noise function f(t) 


Prob[f(t)] = MN exp (- dt oe) (3.44) 


subject to the normalization condition 


f Df (t) Prob[f(t)] = 1 (3.45) 


> 


But given a fixed noise profile f(t) and an initial condition, the path of the particle is 
fully determined by the Langevin equation (3.40). Let’s call this solution 7s. Then the 
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probability that the particle takes the path a, is the same as the probability that the 
force is f, 


Probz;(t)] = Prob[f(t)] = N exp (- dt A) 


= N exp (m> fe (YEr + vvn) 


where, in the last line, we’ve used the Langevin equation (3.40) to relate the force to 
the path taken. But since this equation holds for any path ¥, we can simply drop the 
f label. We have the probability that the particle takes a specific path Z(t) given by 


Prob[Z(t)] = M exp (m I dt (ya + vv?) 


The total probability to go from 7; to Zs should therefore just be the sum over all these 
paths. With one, slightly fiddly, subtlety: the probability is normalized in (3.45) with 
respect to the integration measure over noise variable f. And we want to integrate over 
paths. This means that we have to change integration variables and pick up a Jacobian 
factor for our troubles. We have 


Probl p, tj; Zi ti] = N f DHE exp (m fë (yz + vv(z?) 


=N f Da(t) det M exp (m / dt (ya + vy) (3.46) 


Here the operator M(t, t’) that appears in the Jacobian be thought of as df(t) /dx(t’). 
It can be written down by returning to the Langevin equation (3.40) which relates fand 
x, 

ð / 2 1 
=y tty Vô(t-— t’) 


If we want to think in a simple minded way, we can consider this as a (very large) 


M(t, t’) 


matrix Mw, with columns labelled by the index t and rows labelled by t’. We’ll write 
the two terms in this matrix as M = A + B so the determinant becomes 


det(A + B) = det A det(1 + AB) (3.47) 


The first operator A = yO/Ot å(t — t’) doesn’t depend on the path and its determinant 
just gives a constant factor which can be absorbed into the normalization M. The 
operator AT! in the second term is defined in the usual way as 


ju A(t, ATHE t”) = 6(¢ — t") 
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where the integral over f dt’ is simply summing over the rows of A and the columns of 
A`! as in usual matrix multiplication. It is simple to check that the inverse is simply 
the step function 


1 
ATHE t = 0e — t") (3.48) 
Y 
Now we write the second factor in (3.47) and expand, 


det(1 + A7'B) = exp (Tr log(1 + A'B) ) = exp D Tr(47B}"/n) (3.49) 


Here we should look in more detail at what this compact notation means. The term 
Tr ATİB is really short-hand for 


Tr A'B = faw A(t, t) Bt, t) 


where the integral over f dt’ is multiplying the matrices together while the integral over 
J dt comes from taking the trace. Using (3.48) we have 


1 
TrA “B= - fe dt’ O(t — t)V?V ô(t — t) = a fe VV 


The appearance of 6(0) may look a little odd. This function is defined to be 6(x) = +1 
for x > 0 and (x) = 0 for x < 0. The only really sensible value at the origin is 
6(0) = 1/2. Indeed, this follows from the standard regularizations of the step function, 
for example 


1 1 x 1 
sim pom ( = S- 
O(a) tim (5 + 2 an (2) => 6(0) 3 


What happens to the higher powers of (A~'B)"? Writing them out, we have 


Tr(A7'B)" = fufas ...dtan—1 O(t =% 0 — t2)0 (t2 — t3)ð (ts — t4) --. 
(V?V)" 
ry” 


.. O(ton—2 — ton—1)6(ton—1 — t) 


where we have been a little sloppy in writing (V?V)” because each of these is actually 

computed at a different time. We can use the delta-functions to do half of the integrals, 

say all the tn for n odd. We get 

(v*V)” 
nytt 


Tr(A“B)* = [et f dizdi ... O(t — t2)0(t2 — ta)O(ta — te)... O(ten—2 — t) 
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But this integral is only non-vanishing only if t > tə > t4 > tg >... >t, >t. In 
other words, the integral vanishes. (Note that you might think we could again get 
contributions from 0(0) = 1/2, but the integrals now mean that the integrand has 
support on a set of zero measure. And with no more delta-functions to rescue us, gives 
zero. The upshot of this is that the determinant (3.49) can be expressed as a single 
exponential 


det(1 + A7'B) = exp (> fu vv) 


We now have an expression for the measure factor in (3.46). Using this, the path 
integral for the probability becomes, 


1 . 1 


>2 
— N'e e)-V e2 f Da(t) exp [- J ope 


where U is given in (3.43). Notice that the prefactor eV @r)-V(%:)]/2P takes the same 
form as the map from probabilities P to the rescaled P in (3.41). This completes our 
derivation of the path integral formulation of probabilities. 


3.2.5 Stochastic Calculus 


There is one final generalization of the Langevin equation that we will mention but 
won’t pursue in detail. Let’s return to the case m = 0, but generalise the noise term 
in the Langevin equation so that it is now spatially dependent. We write 


ya = —VV + d(z) f(t) (3.50) 


This is usually called the non-linear Langevin equation. The addition of the b(#) multi- 
plying the noise looks like a fairly innocuous change. But it’s not. In fact, annoyingly, 
this equation is not even well defined! 


The problem is that the system gets a random kick at time t, the strength of which 
depends on its position at time t. But if the system is getting a delta-function impulse 
at time t then its position is not well defined. Mathematically, this problem arises when 
we look at the position after some small time ôt. Our equation (3.20) now becomes 

i 1 1 tte T 
0L =L= -vva f dt bZ) FE) 
Y Y Jt 
and our trouble is in making sense of the last term. There are a couple of obvious ways 
we could move forward: 


DE 


e Ito: We could insist that the strength of the kick is related to the position of the 
particle immediately before the kick took place. Mathematically, we replace the 
integral with 


t+ôt 
| wae o e [av Fe) 
t t 
This choice is known as Ito stochastic calculus. 


e Stratonovich: Alternatively, we might argue that the kick isn’t really a delta 
function. It is really a process that takes place over a small, but finite, time. To 
model this, the strength of the kick should be determined by the average position 
over which this process takes place. Mathematically, we replace the integral with, 


> 


t+ôt 1 t+ôt 
f dt' bZ) f(t’) — 3 [b(z(t + dt)) +o) )] f dt f(t) 
t t 
This choice is known as Stratonovich stochastic calculus. 


Usually in physics, issues of this kind don’t matter too much. Typically, any way of 
regulating microscopic infinitesimals leads to the same macroscopic answers. However, 
this is not the case here and the Ito and Stratonovich methods give different answers 
in the continuum. In most applications of physics, including Brownian motion, the 
Stratonovich calculus is the right way to proceed because, as we argued when we first 
introduced noise, the delta-function arising in the correlation function (f(t) f(t’)) is just 
a convenient approximation to something more smooth. However, in other applications 
such as financial modelling, Ito calculus is correct. 


The subject of stochastic calculus is a long one and won’t be described in this course. 
For the Stratonovich choice, the Fokker-Planck equation turns out to be 


OP 


1 
—o ~ - [P(VV — Dy*bVb)] + DV? (b° P) 


This is also the form of the Fokker-Planck equation that you get by naively dividing 
(3.50) by b(Z) and the defining a new variable j = z/b which reduces the problem to 
our previous Langevin equation (3.19). In contrast, if we use Ito stochastic calculus, 
the bVb term is absent in the resulting Fokker-Planck equation. 
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4. Linear Response 


The goal of response theory is to figure out how a system reacts to outside influences. 
These outside influences are things like applied electric and magnetic fields, or applied 
pressure, or an applied driving force due to some guy sticking a spoon into a quantum 
liquid and stirring. 


We've already looked at a number of situations like this earlier in these lectures. If 
you apply a shearing force to a fluid, its response is to move; how much it moves is 
determined by the viscosity. If you apply a temperature gradient, the response is for 
heat to flow; the amount of heat is determined by the thermal conductivity. However, in 
both of these cases, the outside influence was time independent. Our purpose here is to 
explore the more general case of time dependent influences. As we’ll see, by studying 
the response of the system at different frequencies, we learn important information 
about what’s going on inside the system itself. 


4.1 Response Functions 


Until now, our discussion has been almost entirely classical. Here we want to deal with 
both classical and quantum worlds. For both cases, we start by explaining mathemat- 
ically what is meant by an outside influence on a system. 


Forces in Classical Dynamics 


Consider a simple dynamical system with some generalized coordinates x;(t) which 
depend on time. If left alone, these coordinates will obey some equations of motion, 


This dynamics need not necessarily be Hamiltonian. Indeed, often we’ll be interested 
in situations with friction. The outside influence in this example arises from perturbing 
the system by the addition of some driving forces F;(t), so that the equations of motion 
become, 


Xi + gilt, x) = F;,(t) (4.1) 


In this expression, x;(t) are dynamical degrees of freedom. This is what we’re solving 
for. In contrast, F(t) are not dynamical: they’re forces that are under our control, like 
someone pulling on the end of a spring. We get to decide on the time dependence of 
each F;(t). 
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It may be useful to have an even more concrete example at the back of our minds. 
For this, we take every physicist’s favorite toy: the simple harmonic oscillator. Here 
we'll include a friction term, proportional to y, so that we have the damped harmonic 
oscillator with equation of motion 


ï+ yt +wx = F(t) (4.2) 
We will discuss this model in some detail in section 4.2. 


Sources in Quantum Mechanics 


In quantum mechanics, we introduce the outside influences in a slightly different man- 
ner. The observables of the system are now operators, O;. We’ll work in the Heisenberg 
picture, so that the operators are time dependent: O = O(t). Left alone, the dynamics 
of these operators will be governed by a Hamiltonian H(O). However, we have no 
interest in leaving the system alone. We want to give it a kick. Mathematically this is 
achieved by adding an extra term to the Hamiltonian, 


H source (t) = Qi (t) O; (t) (4.3) 


The ¢;(x) are referred to as sources. They are external fields that are under our 
control, analogous to the driving forces in the example above. Indeed, if we take a 
classical Hamiltonian and add a term of the form x@ then the resulting Euler-Lagrange 
equations include the source ¢ on the right-hand-side in the same way that the force 
F appears in (4.2). 


4.1.1 Linear Response 


We want to understand how our system reacts to the presence of the source or the 
driving force. To be concrete, we'll chose to work in the language of quantum mechanics, 
but everything that we discuss in this section will also carry over to classical systems. 
Our goal is to understand how the correlation functions of the theory change when we 
turn on a source (or sources) ¢;(Z). 


In general, it’s a difficult question to understand how the theory is deformed by the 
sources. To figure this out, we really just need to sit down and solve the theory all over 
again. However, we can make progress under the assumption that the source is a small 
perturbation of the original system. This is fairly restrictive but it’s the simplest place 
where we can make progress so, from now on, we focus on this limit. Mathematically, 
this means that we assume that the change in the expectation value of any operator is 
linear in the perturbing source. We write 


5(O,(t)) = f dt! xlt t) o) (4.4) 
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Here x;;(t; t’) is known as a response function. We could write a similar expression for 
the classical dynamical system (4.1), where 6(O;) is replaced by x;(t) and ¢ is replaced 
by the driving force F;(t). In classical mechanics, it is clear from the form of the 
equation of motion (4.1) that the response function is simply the Green’s function for 
the system. For this reason, the response functions are often called Green’s functions 
and you'll often see them denoted as G instead of x. 


From now on, we'll assume that our system is invariant under time translations. In 
this case, we have 


xy (Gt) = xy- t') 


and it is useful to perform a Fourier transform to work in frequency space. We define 
the Fourier transform of the function f(t) to be 


dw 


5 FO) (4.5) 


fw) = f dee" F( and f= f 


In particular, we will use the convention where the two functions are distinguished only 
by their argument. 


Taking the Fourier transform of (4.4) gives 
6(O;(w)) = fa fu e™' yilt — t') olt) 


— fa fu gerl _ t) et lt’) 
= xij(w) ġ;(w) (4.6) 


We learn the response is “local” in frequency space: if you shake something at frequency 
w, it responds at frequency w. Anything beyond this lies within the domain of non- 
linear response. 


In this section we’ll describe some of the properties of the response function x(w) 
and how to interpret them. Many of these properties follow from very simple physical 
input. To avoid clutter, we'll mostly drop both the i, j indices. When there’s something 
interesting to say, we’ll put them back in. 


4.1.2 Analyticity and Causality 


If we work with a real source ¢ and a Hermitian operator O (which means a real 
expectation value (O)) then y(t) must also be real. Let’s see what this means for the 


=R] = 


Fourier transform x(w). It’s useful to introduce some new notation for the real and 
imaginary parts, 
x(w) = Re x(w) + tlm x(w) 
= y (w) + ix"w) 
This notation in terms of primes is fairly odd the first time you see it, but it’s standard 


in the literature. You just have to remember that, in this context, primes do not mean 
derivatives! 


The real and imaginary parts of the response function x(w) have different interpre- 
tations. Let’s look at these in turn 


e Imaginary Part: We can write the imaginary piece as 


xo) = fe) 0 
--i i - dt x(t)[e* — e] 
--i i. K dt "[y(t) — x(—t)] 


We see that the imaginary part of x(w) is due to the part of the response func- 
tion that is not invariant under time reversal t — —t. In other words, x”(w) 
knows about the arrow of time. Since microscopic systems are typically invariant 
under time reversal, the imaginary part x”(w) must be arising due to dissipative 
processes. 


x” (w) is called the dissipative or absorptive part of the response function. It is 
also known as the spectral function. It will turn out to contain information about 
the density of states in the system that take part in absorptive processes. We’ll 
see this more clearly in an example shortly. 


Finally, notice that x”(w) is an odd function, 
K (mu) = —x"(w) 


e Real Part: The same analysis as above shows that 
+00 


d= / dt l(t) + x(t) 


(oe) 
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The real part doesn’t care about the arrow of time. It is called the reactive part 
of the response function. It is an even function, 


x'(—w) = +x") 


Before we move on, we need to briefly mention what happens when we put the labels 
i, j back on the response functions. In this case, a similar analysis to that above shows 
that the dissipative response function comes from the anti-Hermitian part, 

_ 4 


Xy) = =z) = Gl) (4.7) 


Causality 
We can’t affect the past. This statement of causality means that any response function 
must satisfy 


xt)=0 forallt<0 


For this reason, y is often referred to as the causal Green’s function or retarded Green’s 
function and is sometimes denoted as Gr(t). Let’s see what this simple causality 
requirement means for the Fourier expansion of x, 


=f Pe yy) 


i Cae 


When t < 0, we can perform the integral by completing the contour in the upper-half 
place (so that the exponent becomes —iw x (—i|t|) — —oo). The answer has to be zero. 


Of course, the integral is given by the sum of the residues inside the contour. So if 
we want the response function to vanish for all t < 0, it must be that y(w) has no poles 
in the upper-half plane. In other words, causality requires: 


x(w) is analytic for Imw > 0 


4.1.3 Kramers-Kronig Relation 


The fact that x is analytic in the upper-half plane means that there is a relationship 
between the real and imaginary parts, x’ and x”. This is called the Kramers-Kronig 
relation. Our task in this section is to derive it. We start by providing a few general 
mathematical statements about complex integrals. 
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A Discontinuous Function 


First, consider a general function p(w). We’ll ask that p(w) is meromorphic, meaning 
that it is analytic apart from at isolated poles. But, for now, we won’t place any 
restrictions on the position of these poles. (We will shortly replace p(w) by x(w) which, 
as we’ve just seen, has no poles in the upper half plane). We can define a new function 
f(w) by the integral, 


LP UE 
flw) = =f Te W (4.8) 
Here the integral is taken along the interval w’ € [a,b] of the real line. However, when 
w also lies in this interval, we have a problem because the integral diverges at w = w. 
To avoid this, we can simply deform the contour of the integral into the complex plane, 
either running just above the singularity along w’ + ie or just below the singularity 
along w’ — ie. Alternatively (in fact, equivalently) we could just shift the position of 
the singularity to w + w F e. In both cases we just skim by the singularity and the 
integral is well defined. The only problem is that we get different answers depending 
on which way we do things. Indeed, the difference between the two answers is given by 
Cauchy’s residue theorem, 

flo + te) — Flw io] = ple) (4.9) 
The difference between f(w+ie) and f(w—ie) means that the function f(w) is discontin- 
uous across the real axis for w € [a,b]. If p(w) is everywhere analytic, this discontinuity 
is a branch cut. 


We can also define the average of the two functions either side of the discontinuity. 
This is usually called the principal value, and is denoted by adding the symbol P before 
the integral, 


[fw + ie) + flw—ie)| = =P | a) a (4.10) 


iT ww! — w 


NI rR 


We can get a better handle on the meaning of this principal part if we look at the real 
and imaginary pieces of the denominator in the integrand 1/[w’ — (w + ie)], 
1 7 W — w ie 
w—(wtie) (w -wte (w -w+ e 
By taking the sum of f(w+ ie) and f(w— ie) in (4.10), we isolate the real part, the first 
term in (4.11). This is shown in the left-hand figure. It can be thought of as a suitably 


(4.11) 


cut-off version of 1/(w’—w). It’s as if we have deleted an small segment of this function 
lying symmetrically about divergent point w and replaced it with a smooth function 
going through zero. This is the usual definition of the principal part of an integral. 
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Figure 9: The real part of the function Figure 10: The imaginary part of the 
(4.11), plotted with w’ = 1 and € = 0.5. function (4.11), plotted with w’ = 1 and 
€=0.5 


We can also see the meaning of the imaginary part of 1/(w’ — w), the second term 
in (4.11). This is shown in the right-hand figure. As e > 0, it tends towards a delta 
function, as expected from (4.9). For finite €, it is a regularized version of the delta 
function. 


Kramers-Kronig 


Let’s now apply this discussion to our response function x(w). We’ll be interested in 
the integral 


1 / 
T) du’ xw) wER (4.12) 
injo ww 


where the contour C skims just above the real axis, before closing at infinity in the 
upper-half plane. We’ll need to make one additional assumption: that x(z) falls off 
faster than 1/|z| at infinity. If this holds, the integral is the same as we consider in 
(4.8) with [a,b] —> [—o0, +00]. Indeed, in the language of the previous discussion, the 
integral is f(w — ie), with p = x. 


We apply the formulae (4.9) and (4.10). It gives 


fem ig= p| fae MV] -x 


T w! — w 


But we know the integral in (4.12) has to be zero since x(w) has no poles in the 
upper-half plane. This means that f(w — ie) = 0, or 


x(w) = 1p [- du’ x(w) (4.13) 
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“en 
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The important part for us is that factor of sitting in the denominator. Taking real 


and imaginary parts, we learn that 
tœ dus! Imy (w") 


Re x(w) = p | ar (4.14) 


co T Ww 


and 


Imy(w) = -P (4.15) 
These are the Kramers-Kronig relations. They follow from causality alone and tell 
us that the dissipative, imaginary part of the response function x”(w) is determined 
in terms of the reactive, real part, x'(w) and vice-versa. However, the relationship is 
not local in frequency space: you need to know y'(w) for all frequencies in order to 
reconstruct x” for any single frequency. 


There’s another way of writing these relations which is also useful and tells us how 
we can reconstruct the full response function x(w) if we only know the dissipative part. 
To see this, look at 


[- du’ Imy(w’) (4.16) 


7 : 
_o În w'—w-— ie 


where the —ze in the denominator tells us that this is an integral just below the real 
axis. Again using the formulae (4.9) and (4.10), we have 


i du’ Imy(w’) 


im Wl —W—teE 


+œ d 1 I 1 
= Imy(w) + p| a Ce) 


> : 
Lge. IT td = t= Ge 


= Imy(w) — i Rex(w) (4.17) 


=00) 


Or, rewriting as y(w) = Rex(w) + iImy(w), we get 


x(w) = f eea (4.18) 


; 
co T wW—w-—ie 


If you know the dissipative part of the response function, you know everything. 


An Application: Susceptibility 
Suppose that turning on a perturbation ¢ induces a response (Q) for some observable 
of our system. Then the susceptibility is defined as 
AO) 
Ob |o 
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We've called the susceptibility xy which is the same name that we gave to the response 
function. And, indeed, from the definition of linear response (4.4), the former is simply 
the zero frequency limit of the latter: 
x = lim x(w) 
w—0 
A common example, which we met in our first course in Statistical Mechanics, is the 


change of magnetization M of a system in response to an external magnetic field B. 
The aptly named magnetic susceptibility is given by xy = OM/OB. 


From (4.18), we can write the susceptibility as 


x= l eoo (4.19) 


is 
a T wie 


We see that if you can do an experiment to determine how much the system absorbs 
at all frequencies, then from this information you can determine the response of the 
system at zero frequency. This is known as the thermodynamic sum rule. 


4.2 Classical Examples 


The definitions and manipulations of the previous section can appear somewhat ab- 
stract the first time you encounter them. Some simple examples should shed some 
light. The main example we’ll focus on is the same one that accompanies us through 
most of physics: the classical harmonic oscillator. 


4.2.1 The Damped Harmonic Oscillator 
The equation of motion governing the damped harmonic oscillator in the presence of a 
driving force is 


i+ yi + wir = F(t) (4.20) 


Here y is the friction. We denote the undamped frequency as wo, saving w for the 
frequency of the driving force as in the previous section.. We want to determine the 
response function, or Green’s function, y(t — t) of this system. This is the function 
which effectively solves the dynamics for us, meaning that if someone tells us the driving 
force F(t), the motion is given by 


at) = [- dt' y(t — t) F(t’) (4.21) 


oe) 
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Figure 11: The real, reactive part of the response function for the underdamped harmonic 
oscillator, plotted with wọ = 1 and y = 0.5. 


There is a standard method to figure out y(t). Firstly, we introduce the (inverse) 
Fourier transform 


We plug this into the equation of motion (4.20) to get 


TOP ales 


+00 
27 il dt’[-w? — iqw + wigle WOM yw) F(t!) = F(t) 


which is solved if the f dw gives a delta function. But since we can write a delta 
function as 276(t) = f dwe™*, that can be achieved by simply taking 
1 


= 4.22 
xlo) —w? — iyw + we eee) 


There’s a whole lot of simple physics sitting in this equation which we'll now take some 
time to extract. All the lessons that we’ll learn carry over to more complicated systems. 


Firstly, we can look at the susceptibility, meaning x(w = 0) = 1/u2. This tells us 
how much the observable changes by a perturbation of the system, i.e. a static force: 
x = F'/w? as expected. 


Let’s look at the structure of the response function on the complex w-plane. The 
poles sit at w? + iyw, — w = 0 or, solving the quadratic, at 


W, = -2 + yu- 7/4 


There are two different regimes that we should consider separately, 
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Figure 12: The imaginary, dissipative part of the response function for the underdamped 
harmonic oscillator, plotted with wọ = 1 and y = 0.5. 


e Underdamped: wê > 7/4. In this case, the poles have both a real and imag- 
inary part. They both sit on the lower half plane. This is in agreement with 
our general lesson of causality which tells us that the response function must be 
analytic in the upper-half plane 


e Overdamped: wf < 7/4. Now the poles lie on the negative imaginary axis. 
Again, there are none in the upper-half place, consistent with causality. 


We can gain some intuition by plotting the real and imaginary part of the response 


function for w € R. Firstly, the real part is shown in Figure 11 where we plot 
we — w? 


(Rw Pw? 


Rex(w) = (4.23) 


This is the reactive part. The higher the function, the more the system will respond 
to a given frequency. Notice that Rey(w) is an even function, as expected. 


More interesting is the dissipative part of the response function, 


wy 
Imy(w) aera (4.24) 
This is an odd function. In the underdamped case, this is plotted in Figure 12. Notice 
that Imy is proportional to y, the coefficient of friction. The function peaks around 
+wpo, at frequencies where the system naturally vibrates. This is because this is where 
the system is able to absorb energy. However, as y —> 0, the imaginary part doesn’t 
become zero: instead it tends towards two delta functions situated at +woọ. 
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4.2.2 Dissipation 


We can see directly how Imx(w) is related to dissipation by computing the energy 
absorbed by the system. This what we used to call the work done on the system before 
we became all sophisticated and grown-up. It is 


P F(t)i(t) 
-FOSS dt' y(t — t') F(t’) 
= ro f at i = je ey (are) 
=f E iwl e Ru) (4.25) 


Let’s drive the system with a force of a specific frequency 2, so that 
F(t) = Fo cos Ot = Fy Re(e7™*) 


Notice that it’s crucial to make sure that the force is real at this stage of the calculation 
because the reality of the force (or source) was the starting point for our discussion 
of the analytic properties of response functions in section 4.1.2. In a more pedestrian 
fashion, we can see that it’s going to be important because our equation above is not 
linear in F(w), so it’s necessary to take the real part before progressing. Taking the 
Fourier transform, the driving force is 


F(w) = 20 Fo [5(w — 2) + d(w + Q)] 


Inserting this into (4.25) gives 


d r ; : R 
a = —iF2Q ae — x(—Q)et™)] le Sa e] (4.26) 


This is still oscillating with time. It’s more useful to take an average over a cycle, 


dw Q 2/2 aw 
a = | at = —1F2Ox(O) — x(-2)] 
0 


dt — Qn dt 


But we’ve already seen that Rey(w) is an even function, while Imy(w) is an odd 
function. This allows us to write 
dW 


a 2F2Q Imy(Q) (4.27) 
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We see that the work done is proportional to Imy. To derive this result, we didn’t 
need the exact form of the response function; only the even/odd property of the 
real/imaginary parts, which follow on general grounds. For our damped harmonic 
oscillator, we can now use the explicit form (4.24) to derive 


a0? 
(wa — 07)? + (79)? 


—- =2F3 


This is a maximum when we shake the harmonic oscillator at its natural frequency, 
Q = wo. As this example illustrates, the imaginary part of the response function tells 
us the frequencies at which the system naturally vibrates. These are the frequencies 
where the system can absorb energy when shaken. 


4.2.3 Hydrodynamic Response 


For our final classical example, we’ll briefly return to the topic of hydrodynamics. One 
difference with our present discussion is that the dynamical variables are now functions 
of both space and time. A typical example that we’ll focus on here is the mass density, 
p(Z,t). Similarly, the driving force (or, in the context of quantum mechanics, the 
source) is similarly a function of space and time. 


Rather than playing at the full Navier-Stokes equation, here we’ll instead just look 
at a simple model of diffusion. The continuity equation is 


We’ll write down a simple model for the current, 
J=-DVp+F (4.28) 
where D is the diffusion constant and the first term gives rise to Fick’s law that we met 


already in Section 1. The second term, F=F (Z,t), is the driving force. . Combining 
this with the continuity equation gives, 


DY pee oF (4.29) 
We want to understand the response functions associated to this force. This includes 


both the response of p and the response of J, 
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For simplicity, let’s work in a single spatial dimension so that we can drop the vector 
indices. We write 


plx, t) = [eae Vole tere @) 
Jn) = fow Viet t)E(x', t) 


where we’ve called the second label J on both of these functions to reflect the fact that 
F is a driving force for J. We follow our discussion of Section 4.1.1. We now assume 
that our system is invariant under both time and space translations which ensures that 
the response function depend only on t — t and z'— x. We then Fourier transform with 
respect to both time and space. For example, 


p(w,t) = [teat pte, t) 


Then in momentum and frequency space, the response functions become 


The diffusion equation (4.29) immediately gives an expression for y,,;. Substituting 
the resulting expression into (4.28) then gives us v7. The response functions ar 


E ik E —iw 
o iw + DR? iwp De 
Both of the denominator have poles on the imaginary axis at w = —iDk?. This is the 


characteristic behaviour of response functions capturing diffusion. 


Our study of hydrodynamics in Sections 2.4 and 2.5 revealed a different method of 
transport, namely sound. For the ideal fluid of Section 2.4, the sound waves travelled 
without dissipation. The associated response function has the form 


1 
Xsound ™ 7575 
w? — y2k? 


which is simply the Green’s function for the wave equation. If one includes the effect of 
dissipation, the poles of the response function pick up a (negative) imaginary part. For 
sound waves in the Navier-Stokes equation, we computed the location of these poles in 
(2.76). 
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4.3 Quantum Mechanics and the Kubo Formula 


Let’s now return to quantum mechanics. Recall the basic set up: working in the 
Heisenberg picture, we add to a Hamiltonian the perturbation 

Hoource(t) = 6,(8)O;(t) (4.30) 
where there is an implicit sum over 7, labelling the operators in the theory and, corre- 
spondingly, the different sources that we can turn on. Usually in any given situation we 
only turn on a source for a single operator, but we may be interested in how this source 
affects the expectation value of any other operator in the theory, (O;). However, if we 
restrict to small values of the source, we can address this using standard perturbation 
theory. We introduce the time evolution operator, 


t 
G4 \ =P em (-: f Halt) 
to 


which is constructed to obey the operator equation idU/dt = Hsource U. Then, switching 
to the interaction picture, states evolve as 


|h(t)) 1 = U(t, to) |W (to))z 


We’ll usually be working in an ensemble of states described by a density matrix p. If, 
in the distant past t — oo, the density matrix is given by po, then at some finite time 
it evolves as 


p(t) = U(t)poU~*(t) 
with U(t) = U(t, tp > —oo). From this we can compute the expectation value of any 
operator O; in the presence of the sources ¢. Working to first order in perturbation 
theory (from the third line below), we have 


(Oi(t) |g = Tr p()O:(t) 
= Tr (tU (HOUE) 


~ Te polt) (+i fal owel) O) 
= (OOla +i f 


t 


dt! ([Hyource(t!), OON) +- 


Inserting our explicit expression for the source Hamiltonian gives the change in the 
expectation value, 6(O;) = (O;)4 — (Oi) d=0; 


5(O;) <4 f dt’ (OE), OON AE) 


=i f ae ole e) (Oe), ON 6,0) (4.31) 


OO 
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where, in the second line, we have done nothing more than use the step function to 
extend the range of the time integration to +oo. Comparing this to our initial definition 
given in (4.4), we see that the response function in a quantum theory is given by the 
two-pont function, 


Xij(t — t’) = —10(¢ — t') ([Oi(t), O;(#’)]) (4.32) 


This important result is known as the Kubo formula. (Although sometimes the name 
“Kubo formula” is restricted to specific examples of this equation which govern trans- 
port properties in quantum field theory. We will derive these examples in Section 4.4). 


4.3.1 Dissipation Again 


Before we make use of the Kubo formula, we will first return to the question of dis- 
sipation. Here we repeat the calculation of 4.2.2 where we showed that, for classical 
systems, the energy absorbed by a system is proportional to Im y. Here we do the same 
for quantum systems. The calculation is a little tedious, but worth ploughing through. 


As in the classical context, the work done is associated to the change in the energy 
of the system which, this time, can be written as 


dW d : : 


To compute physical observables, it doesn’t matter if we work in the Heisenberg or 
Schrodinger picture. So lets revert momentarily back to the Schrodinger picture. Here, 
the density matrix evolves as ip = |H, p], so the first term above vanishes. Meanwhile, 
the Hamiltonian H changes because we’re sitting there playing around with the source 
(4.30), providing an explicit time dependence. To simplify our life, we’ll assume that 
we turn on just a single source, œ. Then, in the Schrodinger picture 


H = O6(t) 
This gives us the energy lost by the system, 
dw ’ 
Toi Tr(pO $) = (0) 6% = [(O) g=0 + 6(O)] ¢ 


We again look at a periodically varying source which we write as 


g(t) = Re(po e-™) 
and we again compute the average work done over a complete cycle 


dw gQ [ow dW 
dt 2r Jo dt 
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The term (O(Z)), cancels out when integrated over the full cycle. This leaves us with 


dw Q 27 /Q +00 7 
Ww _ Q dt f dt’ y(t — t) (t’) o(t) 


dt 2T Jo m 
Q 2r/Q +00 d l f 
Z a af dt’ lE (eae er) 
27 Jo Lees 27 
1 OH! Gi ; , 
x7 | doe 4 gjet | | - iNe as iNeger*| 
1 


z (9) — x(-Q)] Igo)? 40 


where the ¢* and ¢*? terms have canceled out after performing the f dt. Continuing, 
we only need the fact that the real part of x is even while the imaginary part is odd. 
This gives us the result 


TE = 52" (ldo? (4.33) 
Finally, this calculation tells us about another property of the response function. If we 
perform work on a system, the energy should increase. This translates into a positivity 
requirement Q x”(Q) > 0. More generally, the requirement is that 0y7,(Q) is a positive 


definite matrix. 


Spectral Representation 


In the case of the damped harmonic oscillator, we saw explicitly that the dissipation 
was proportional to the coefficient of friction, y. But for our quantum systems, the 
dynamics is entirely Hamiltonian: there is no friction. So what is giving rise to the 
dissipation? In fact, the answer to this can also be found in our analysis of the harmonic 
oscillator, for there we found that in the limit y — 0, the dissipative part of the response 
function y” doesn’t vanish but instead reduces to a pair of delta functions. Here we 
will show that a similar property holds for a general quantum system. 


We’ll take the state of our quantum system to be described by a density matrix 
describing the canonical ensemble, p = e~°". Taking the Fourier transform of the 
Kubo formula (4.32) gives 


xij(w) = —i ~ dt e Tr (eO), o;(0)) 


0 
We will need to use the fact that operators evolve as O;(t) = U~!O;(0)U with U = e~## 
and will evaluate x;;(w) by inserting complete basis of energy states 


xj) = —i l de E a E a a Fo 
0 


mn 
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—(m|O; In) (n|O; lm) ei Bn— Em) 


To ensure that the integral is convergent for t > 0, we replace w > w + ie. Then 
performing the integral over f dt gives 


xiv + ie) = De [et ge B= Bae 


m,n 


_ (Oi)mn(Oj)nm —Emb _ „—Enb 
Daa R 


which tells us that the response function has poles just below the real axis, 
w = En — Em — ic 


Of course, we knew on general grounds that the poles couldn’t lie in the upper half- 
plane: we see that in a Hamiltonian system the poles lie essentially on the real axis (as 
€ — 0) at the values of the frequency that can excite the system from one energy level 
to another. In any finite quantum system, we have an isolated number of singularities. 


As in the case of the harmonic oscillator, in the limit € — 0, the imaginary part of 
the response function doesn’t disappear: instead it becomes a sum of delta function 
spikes 


1 € 
~ — ô (w — (En — Em 
i > uri EA >, g= )) 

The expression above is appropriate for quantum systems with discrete energy levels. 
However, in infinite systems — and, in particular, in the quantum field theories that 
we turn to shortly — these spikes can merge into smooth functions and dissipative 


behaviour can occur for all values of the frequency. 


4.3.2 Fluctuation-Dissipation Theorem 


We have seen above that the imaginary part of the response function governs the 
dissipation in a system. Yet, the Kubo formula (4.32) tells us that the response formula 
can be written in terms of a two-point correlation function in the quantum theory. 
And we know that such two-point functions provide a measure of the variance, or 
fluctuations, in the system. This is the essence of the fluctuation-dissipation theorem 
which we'll now make more precise. 
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First, the form of the correlation function in (4.32) — with the commutator and 
funny theta term — isn’t the simplest kind of correlation we could image. The more 
basic correlation function is simply 


Sig(t) = (Oi(t)O;(0)) 


where we have used time translational invariance to set the time at which O; is evalu- 
ated to zero. The Fourier transform of this correlation function is 


Sylw) = fu e™t Si;(t) (4.34) 


The content of the fluctuation-dissipation theorem is to relate the dissipative part of 
the response function to the fluctuations S(w) in the vacuum state which, at finite 
temperature, means the canonical ensemble p = e~8”. 


There is a fairly pedestrian proof of the theorem using spectral decomposition (i.e. 
inserting a complete basis of energy eigenstates as we did in the previous section). Here 
we instead give a somewhat slicker proof although, as we will see, it requires us to do 
something fishy somewhere. We proceed by writing an expression for the dissipative 
part of the response function using the Kubo formula (4.32), 

xue) = —5 blt) — xal) 


-20l K000) - (000: 


! | 
+50(-t) |(O;(-#)0;(0)) = (2;(0)0;(-2)| 


By time translational invariance, we know that (O;(0)O;(t)) = (O;(—t)O,(0)). This 
means that the step functions arrange themselves to give 0(t) + 0(—t) = 1, leaving 


xig(t) = —5(Oi(t)O;(0)) + 5 (O;(—t)O;(0)) (4.35) 


But we can re-order the operators in the last term. To do this, we need to be sitting 
in the canonical ensemble, so that the expectation value is computed with respect to 
the Boltzmann density matrix. We then have 


(O;(—t)O;(0)) = Tr e~°"O,(—4)0;(0) 
= Tre °#0,(—t)e?# e F# O,(0) 
= Tr e°7#0,(0)O;(— t + if) 
= (O;(t — 78)O;(0)) 


2/97 = 


The third line above is where we’ve done something slippery: we’ve treated the density 
matrix p = e-° as a time evolution operator, but one which evolves the operator in 
the imaginary time direction! In the final line we’ve used time translational invariance, 
now both in real and imaginary time directions. While this may look dodgy, we can 
turn it into something more palatable by taking the Fourier transform. The dissipative 
part of the response function can be written in terms of correlation functions as 


1 


xi) = -3 | il) 0;(0)) — (Oilt — 1B) O;(0)) (4.36) 


Taking the Fourier transform then gives us our final expression: 


gig E EO (4.37) 


This is the fluctuation-dissipation theorem, relating the fluctuations in frequency space, 
captured by S(w), to the dissipation, captured by x”(w). Indeed, a similar relationship 
holds already in classical physics; the most famous example is the Einstein relation 
that we met in Section 3.1.3. 


The physics behind (4.37) is highlighted a little better if we invert the equation. We 
can write 


5i3(w) = —2[nplw) + 1] Xij (w) 


where ng(w) = (e% — 1)! is the Bose-Einstein distribution function. Here we see 
explicitly the two contributions to the fluctuations: the ng(w) factor is due to thermal 
effects; the “+1” can be thought of as due to inherently quantum fluctuations. As usual, 
the classical limit occurs for high temperatures with Bw < 1 where np(w) ~ kpT/w. 
In this regime, the fluctuation dissipation theorem reduces to its classical counterpart 


Qk pT 
Sylw) = xk (w) 


W 


4.4 Response in Quantum Field Theory 


We end these lectures by describing how response theory can be used to compute some 
of the transport properties that we’ve encountered in previous sections. To do this, 
we work with Quantum Field Theory where the operators become functions of space 
and time, O(7,t). In the context of condensed matter, this is the right framework 
to describe many-body physics. In the context of particle physics, this is the right 
framework to describe everything. 
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Suppose that you take a quantum field theory, place it in a state with a finite amount 
of stuff (whatever that stuff is) and heat it up. What is the right description of the 
resulting dynamics? From our earlier discussion, we know the answer: the low-energy 
excitations of the system are described by hydrodynamics, simply because this is the 
universal description that applies to everything. (Actually, we’re brushing over some- 
thing here: the exact form of the hydrodynamics depends on the symmetries of the 
theory, both broken and unbroken). All that remains is to identify the transport co- 
efficients, such as viscosity and thermal conductivity, that arise in the hydrodynamic 
equations. But how to do that starting from the quantum field? 


The answer to this question lies in the machinery of linear response that we developed 
above. For a quantum field, we again add source terms to the action, now of the form 


Heource(t) = | d1 ¢;(Z,t)O:(Z, t) (4.38) 


The response function y is again defined to be the change of the expectation values of 
O in the presence of the source ¢, 


5(O,(z,t)) = | d@x’dt!' xi;(2,t,2',t') olt 4.39) 
3I J 


All the properties of the response function that we derived previously also hold in the 
context of quantum field theory. Indeed, for the most part, the label 7 and #’can be 
treated in the same way as the label i, j. Going through the steps leading to the Kubo 
formula (4.32), we now find 


xij (@, Bt — t) = —i0(t — tO, t), O(t (4.40) 


We learned in our first course on Quantum Field Theory that the two-point functions 
are Green’s functions. Usually, when thinking about scattering amplitudes, we work 
with time-ordered (Feynman) correlation functions that are relevant for building per- 
turbation theory. Here, we interested in the retarded correlation functions, characterised 
by the presence of the step function sitting in front of (4.40). 


Finally, if the system exhibits translational invariance in both space and time, then 
the response function depends only on the differences t— t and #—Z’. In this situation 


it is useful to work in momentum and frequency space, so that the (4.39) becomes 


5(O;(k, w) = xulk w) p(k, w) (4.41) 
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Electrical Conductivity 


Consider a quantum field theory with a U(1) global symmetry. By Noether’s theorem, 
there is an associated conserved current J” = (J°, J‘), obeying ô, J” = 0. This current 
is an example of a composite operator. It couples to a source which is a gauge field 
A, (x ), 


Agrees = Jen ApJ” (4.42) 


Here A,, is the background gauge field of electromagnetism. However, for the purposes 
of our discussion, we do not take A, to have dynamics of its own. Instead, we treat it 
as a fixed source, under our control. 


There is, however, a slight subtlety. In the presence of the background gauge field, 
the current itself may be altered so that it depends on A,. A simple, well known, 
example of this occurs for a free, relativistic, complex scalar field y. The conserved 
current in the presence of the background field is given by 


J” = iejo" y — ("pije — eA" yip (4.43) 
where e is the electric charge. With this definition, the Lagrangian can be written in 
terms of covariant derivatives D y = ô p — ie Auo, 


L= pers lA,e/? + ApJ” = pers IDo]? (4.44) 


For non-relativistic fields (either bosons or fermions), similar A, terms arise in the 
current for the spatial components. 


We want to derive the response of the system to a background electric field. Which, 
in more basic language, means that we want to derive Ohm’s law in our quantum field 
theory. This is 


(Ji(k,w)) = oy (k, w) E(k, w) (4.45) 


Here E; is the background electric field in Fourier space and o;; is the conductivity 
tensor. In a system with rotational and parity invariance (which, typically means in 
the absence of a magnetic field) we have gi; = 06;;, so that the current is parallel to 
the applied electric field. Here we will work with the more general case. Our goal is to 
get an expression for g;j in terms of correlation functions in the field theory. Applying 
(4.41) with the perturbation (4.42), we have 


(Ja) = (Ju) = (Jao = -i f dda ([J (E, t), JZ’, t )])o A(z’) (4.46) 
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The subscript 0 here means the quantum average in the state A, = 0 before we turn 
on the background field. Let’s start by looking at the term (J;)o. You might think that 
there are no currents before we turn on the background field. But, in fact, the extra 
term in (4.43) gives a contribution even if — as we’ll assume — the unperturbed state 
has no currents. This contribution is 


(Jijo = € Ailylp)o = eAip 


where p is the background charge density. Notice it is not correct to set A; = 0 in this 
expression; the subscript 0 only means that we are evaluating the expectation value in 
the A; = 0 quantum state. 


Let’s now deal with the right-hand side of (4.46). If we work in Ap = 0 gauge (where 
things are simplest), the electric field is given by FE; = —A;. In Fourier transform space, 
this becomes 


A;(w) = Z (4.47) 


We can now simply Fourier transform (4.46) to get it in the form of Ohm’s law (4.45). 
The conductivity tensor has two contributions: the first from the background charge 
density; the second from the retarded Green’s function 


(4.48) 


with the Fourier transform of the retarded Green’s function given in terms of the 
current-current correlation function 


vij(k,w) = —i | j dt BZ 0(t) *-*) E, t), J; (0,0) 


OO 


This is the Kubo formula for conductivity. 


Viscosity 


We already saw in Section 2 that viscosity is associated to the transport of momentum. 
And, just as for electric charge, momentum is conserved. For field theories that are 
invariant under space and time translations, Noether’s theorem gives rise to four cur- 
rents, associated to energy and momentum conservation. These are usually packaged 
together into the stress-energy tensor T””, obeying 0,7” = 0. (We already met this 
object in a slightly different guise in Section 2, where the spatial components appeared 
as the pressure tensor P;; and the temporal components as the overall velocity u;). 
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The computation of viscosity in the framework of quantum field theory is entirely 
analogous to the computation of electrical conductivity. The electric current is simply 
replaced by the momentum current. Indeed, as we already saw in Section 2.5.3, the 
viscosity tells us the ease with which momentum in, say, the x-direction can be trans- 
ported in the z-direction. For such a set-up, the relevant component of the current is 
T**. The analog of the formula for electrical conductivity can be re-interpreted as a 
formula for viscosity. There are two differences. Firstly, there is no background charge 
density. Secondly, the viscosity is for a constant force, meaning that we should take 
the w > 0 and k — 0 limit of our equation. We have 


Yez,02(k,w) = —i / dt ËZ 0(t) eiwt-k-z) ((Tr2(Z, t), Tr2(0, 0)]) 


and 


= fy Xäzäz 0, w) 
" w—>0 iw 


This is the Kubo formula for viscosity. 
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